Computer Vision Project (Ongoing) – Structural Drawing Interpretation Using Deep Learning
Image is AI-generated.
Summary
I am developing a computer vision pipeline to extract structural elements directly from engineering drawings. The project focuses on identifying beams, columns, and structural walls in plan drawings and producing instance-level masks that preserve geometry and connectivity. Rather than simply locating objects, the goal is to generate structured outputs that could support automated modelling or quantity extraction workflows.
I am building custom datasets from real structural plans, training deep convolutional models, and evaluating performance using precision–recall and IoU metrics. The longer-term objective is to move from image recognition toward usable structural data.
Problem Context
Structural drawings are visually clean for engineers but complex for machines. Elements are defined by linework, overlap with text and dimensions, and vary across drafting styles. The challenge is not detecting shapes in isolation, but correctly identifying structural members while preserving member thickness, orientation, continuity across intersections, and connectivity between beams and columns. If the output cannot retain geometry, it has limited engineering value.
Dataset Construction
Datasets are constructed from structural plan drawings by converting vector PDFs to high-resolution raster images, cropping them into tiled patches to maintain detail, and manually annotating instance-level masks for beams, columns, and walls. Instance-level labelling allows each structural member to be treated as a separate object rather than part of a single class mask, which matters later when extracting centrelines and connectivity.
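The tiling step can be illustrated with a minimal sketch. The function name `tile_origins`, the default tile size, and the overlap value are assumptions chosen for illustration, not the project's actual settings. The idea is that neighbouring tiles overlap so a member cut by one tile boundary appears whole in an adjacent tile, and the last tile on each axis is clamped to the image edge.

```python
def tile_origins(width, height, tile=1024, overlap=128):
    """Return (x, y) top-left corners of square tiles covering the image.

    Hypothetical helper: tile and overlap defaults are illustrative only.
    """
    if not 0 <= overlap < tile:
        raise ValueError("overlap must be non-negative and smaller than tile")
    stride = tile - overlap

    def axis_starts(length):
        # An image smaller than one tile gets a single tile at the origin.
        if length <= tile:
            return [0]
        starts = list(range(0, length - tile, stride))
        # Clamp the final tile so it ends exactly at the image edge.
        starts.append(length - tile)
        return starts

    return [(x, y) for y in axis_starts(height) for x in axis_starts(width)]


if __name__ == "__main__":
    for origin in tile_origins(2048, 1024):
        print(origin)
```

Clamping the last tile instead of padding keeps every patch at full resolution, at the cost of a larger overlap on the final row and column.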
Augmentation (rotation, scaling, brightness variation, and controlled noise injection) is applied to reduce the risk of the model overfitting to a single drafting style.
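One detail worth showing is that geometric transforms must be applied to the image and its mask together, while photometric changes touch only the image. The sketch below is a dependency-free illustration using nested lists of grey values. The names `rotate90` and `augment_pair` are hypothetical, rotation is restricted to quarter turns, and scaling is omitted for brevity.

```python
import random


def rotate90(grid):
    """Rotate a 2D list 90 degrees clockwise."""
    return [list(row) for row in zip(*grid[::-1])]


def augment_pair(image, mask, quarter_turns=0, brightness=1.0,
                 noise_sigma=0.0, rng=None):
    """Apply the same rotation to image and mask; brightness/noise to image only.

    Assumes 8-bit grey values (0-255) in `image` and label values in `mask`.
    """
    rng = rng or random.Random()
    for _ in range(quarter_turns % 4):
        image, mask = rotate90(image), rotate90(mask)

    def adjust(value):
        value = value * brightness
        if noise_sigma > 0:
            value += rng.gauss(0.0, noise_sigma)
        # Clamp back into the valid 8-bit range.
        return min(255, max(0, round(value)))

    image = [[adjust(v) for v in row] for row in image]
    return image, mask


if __name__ == "__main__":
    img = [[100, 200], [50, 0]]
    msk = [[1, 0], [0, 0]]
    print(augment_pair(img, msk, quarter_turns=1, brightness=1.5))
```

Passing a seeded `random.Random` as `rng` makes the noise reproducible, which helps when comparing training runs.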
Model Approach
The current pipeline uses an instance-aware convolutional architecture that produces a class label, bounding region, and per-instance segmentation mask for each structural member, so that each member can be isolated while retaining its geometry.
From the mask outputs, post-processing extracts centrelines, estimates member thickness, identifies intersections between elements, and cleans up spurious detections. This geometry-aware stage is essential: a bounding box alone is insufficient for structural interpretation.
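As a rough illustration of what this stage computes, the sketch below derives a centreline, a mean thickness, and a simple adjacency test from instance masks. It makes strong simplifying assumptions that are not taken from the project: members are axis-aligned and roughly rectangular, masks are given as sets of `(row, col)` pixels, and the names `describe_member` and `members_touch` are hypothetical. Skewed or curved members would need a proper skeletonisation step instead.

```python
def describe_member(pixels):
    """Estimate orientation, length, mean thickness and centreline of one mask.

    Assumes an axis-aligned, roughly rectangular member.
    """
    pixels = set(pixels)
    if not pixels:
        raise ValueError("empty mask")
    rows = [r for r, _ in pixels]
    cols = [c for _, c in pixels]
    r0, r1, c0, c1 = min(rows), max(rows), min(cols), max(cols)
    height, width = r1 - r0 + 1, c1 - c0 + 1

    horizontal = width >= height
    length = width if horizontal else height
    # Mean thickness: mask area divided by extent along the member axis.
    thickness = len(pixels) / length
    if horizontal:
        mid = (r0 + r1) / 2
        centreline = ((mid, c0), (mid, c1))
    else:
        mid = (c0 + c1) / 2
        centreline = ((r0, mid), (r1, mid))
    return {
        "orientation": "horizontal" if horizontal else "vertical",
        "length": length,
        "thickness": thickness,
        "centreline": centreline,
    }


def members_touch(a, b, tolerance=1):
    """True if any pixel of `a` lies within `tolerance` pixels of mask `b`."""
    b = set(b)
    offsets = range(-tolerance, tolerance + 1)
    return any((r + dr, c + dc) in b
               for r, c in a for dr in offsets for dc in offsets)


if __name__ == "__main__":
    beam = {(r, c) for r in range(2, 5) for c in range(10)}
    column = {(r, c) for r in range(7) for c in range(10, 13)}
    print(describe_member(beam))
    print(members_touch(beam, column))
```

Dividing area by length gives an average thickness that tolerates ragged mask edges better than measuring a single cross-section.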
Training & Evaluation
Training is conducted in PyTorch with stratified train/validation/test splits and a loss that combines cross-entropy and mask components. Confidence thresholds are tuned, and performance is assessed via precision–recall curves, mean Average Precision (mAP), and mask Intersection over Union (IoU).
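For reference, mask IoU and a single precision–recall point can be sketched as follows. This is a simplified illustration, not the project's evaluation script: masks are represented as sets of `(row, col)` pixels to avoid dependencies, class labels are ignored, matching is greedy in descending score order, and the function names and default thresholds are assumptions. Sweeping `score_threshold` would trace out a precision–recall curve.

```python
def mask_iou(a, b):
    """Intersection over Union of two masks given as pixel sets."""
    a, b = set(a), set(b)
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def precision_recall(predictions, ground_truth,
                     score_threshold=0.5, iou_threshold=0.5):
    """Precision and recall at one confidence threshold.

    predictions: list of (score, pixels); ground_truth: list of pixels.
    Each ground-truth mask can be matched by at most one prediction.
    """
    kept = sorted((p for p in predictions if p[0] >= score_threshold),
                  key=lambda p: p[0], reverse=True)
    unmatched = list(range(len(ground_truth)))
    true_positives = 0
    for _, pixels in kept:
        best_iou, best_idx = 0.0, None
        for idx in unmatched:
            iou = mask_iou(pixels, ground_truth[idx])
            if iou > best_iou:
                best_iou, best_idx = iou, idx
        if best_idx is not None and best_iou >= iou_threshold:
            true_positives += 1
            unmatched.remove(best_idx)
    precision = true_positives / len(kept) if kept else 0.0
    recall = true_positives / len(ground_truth) if ground_truth else 0.0
    return precision, recall


if __name__ == "__main__":
    gt = [{(0, 0), (0, 1), (0, 2), (0, 3)}, {(5, 5), (5, 6)}]
    preds = [(0.9, {(0, 0), (0, 1), (0, 2)}),
             (0.8, {(9, 9)}),
             (0.3, {(5, 5), (5, 6)})]
    print(precision_recall(preds, gt, score_threshold=0.5))
    print(precision_recall(preds, gt, score_threshold=0.2))
```

Lowering the confidence threshold in this toy example admits the third prediction, which raises recall because it matches the second ground-truth member.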
Rather than relying on overall accuracy, I track specific failure modes: thin members being only partially detected, columns merging with beam masks, and drawing annotations (such as text and dimensions) being incorrectly classified as structure. These inform both dataset refinement and model adjustments.
Engineering Direction
The emphasis is structural usefulness, not just image classification. Key questions guiding development: Can extracted masks be converted into clean centrelines? Can beam–column joints be detected reliably? Can the output feed into a simplified structural model? The work is gradually moving from detection toward structured geometric representation.
Code & Repository
The full source code is not published here as it forms part of an ongoing supervised research project and includes structural drawing datasets provided under restricted use agreements. The implementation includes custom data loaders and image tiling pipelines, PyTorch training and inference scripts, evaluation scripts for precision–recall and IoU, and post-processing routines for geometry extraction.
Key Skills & Tools
Computer Vision
- Instance-level segmentation
- Mask-based structural extraction
- Morphological image processing
- Geometric post-processing
Machine Learning
- PyTorch
- Custom dataset construction and annotation
- Precision–recall and IoU evaluation
- Hyperparameter tuning
- Failure mode analysis
Engineering Application
- Geometry-aware model design
- Structural connectivity extraction
- Bridging vision outputs to engineering use
- Ongoing supervised research project