Validate Data – Introduction
Validation has two purposes in BIM data processing. First, your data analysis and insights are only as good as the quality of your data, and validation is one way to make sure that quality is there. Second, creating 100% accurate, consistent, and complete data is hard. That is not a problem as long as you know what is missing or which data points might have problems, and validation helps you find out.
There are two places in the data processing workflow where you might want to validate your data.
First, validating the raw data before processing makes sure that your dataflows can process it.
Second, validating the data after processing makes sure that everything went as planned: the automated processing was successful, and the data you produced is reliable and usable.
Note that this is not about checking the design or design coordination. That is a topic of its own.
Different Ways to Validate Data
There are several ways to validate data. You don’t always need to use all of them.
Presence check (mandatory constraint)
To process data or use it for a specific purpose, certain data is mandatory: it must be present. In Simplebim, you can check:
- That the mandatory objects are in the model, like space objects for the energy analysis.
- That mandatory properties are in the model and have values, meaning they are not allowed to be empty. For example, space type.
- That mandatory groups are in the model. These can be groups of objects used in the processing, or for example, classification system items.
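The logic of a presence check can be sketched in a few lines of plain Python. This is only an illustration of the idea, not the Simplebim API: the objects are hypothetical dictionaries, and the names `missing_classes`, `missing_values`, and the "Space Type" property are assumptions made for the example. A check for mandatory groups would follow the same pattern as the check for mandatory objects.

```python
# Hypothetical model data: each object is a plain dict (not the Simplebim API).
objects = [
    {"id": "s1", "class": "Space", "Space Type": "Office"},
    {"id": "s2", "class": "Space", "Space Type": ""},
    {"id": "w1", "class": "Wall"},
]


def missing_classes(objects, required_classes):
    """Return the required object classes that do not occur in the model."""
    present = {obj["class"] for obj in objects}
    return sorted(set(required_classes) - present)


def missing_values(objects, object_class, prop):
    """Return ids of objects of one class whose property is absent or empty."""
    return [
        obj["id"]
        for obj in objects
        if obj["class"] == object_class and obj.get(prop) in (None, "")
    ]


absent = missing_classes(objects, ["Space", "Slab"])
empty = missing_values(objects, "Space", "Space Type")
print("Missing object classes:", absent)
print("Spaces without a space type:", empty)
```

Here the model contains no slabs, and one space has an empty space type, so both checks report a finding.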
Allowed value check (set member constraint)
Automated analysis and processing require that property values are consistent and normalized. In Simplebim, you can check that property values belong to a set of allowed values. For example:
- Rule-based grouping or classification of objects requires that the type identifiers are always the same.
- Comparing the model to the room schedule requires that the model has the same space names as the room schedule.
- To automatically connect BIM data to other data sources, the generated keys (so-called foreign keys) must be as expected.
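An allowed value check is essentially a set membership test. The sketch below assumes the allowed space names come from a room schedule and that the model's spaces are plain dictionaries; `not_allowed` and the sample data are hypothetical and not part of Simplebim.

```python
# Allowed values, for example the space names listed in the room schedule.
allowed_space_names = {"Office", "Meeting Room", "Corridor"}

# Hypothetical spaces from the model.
spaces = [
    {"id": "s1", "Name": "Office"},
    {"id": "s2", "Name": "office "},  # wrong case and a trailing space
    {"id": "s3", "Name": "Lobby"},    # not in the room schedule
]


def not_allowed(objects, prop, allowed):
    """Map object id to the offending value for every value outside the set."""
    return {
        obj["id"]: obj.get(prop)
        for obj in objects
        if obj.get(prop) not in allowed
    }


violations = not_allowed(spaces, "Name", allowed_space_names)
print(violations)
```

The comparison is deliberately exact. A value such as "office " fails, which is the point: inconsistent spelling is what breaks rule-based grouping and key matching later on.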
Format check
Some text values must follow a specific pattern. In Simplebim, you can check the format of the property values. For example:
- Building storeys must follow a naming scheme, e.g. the name starts with two digits, followed by a space and any number of alphanumeric characters.
- In classification systems, the codes are often structured in a very specific way. In Uniclass, this could be something like SL_45_10_49.
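Format checks are commonly expressed as regular expressions. The sketch below uses Python's `re` module. The storey pattern follows the naming scheme described above literally, while the classification pattern is a simplified assumption modelled on the example code SL_45_10_49, not the full Uniclass rules. The helper `bad_format` is hypothetical.

```python
import re

# Two digits, a space, then any number of alphanumeric characters.
STOREY_NAME = re.compile(r"\d{2} [A-Za-z0-9]*")

# Simplified assumption: two letters followed by one to four "_NN" groups.
CLASS_CODE = re.compile(r"[A-Za-z]{2}(_\d{2}){1,4}")


def bad_format(values, pattern):
    """Return the values that do not match the pattern in full."""
    return [value for value in values if not pattern.fullmatch(value)]


bad_storeys = bad_format(
    ["01 Ground", "02 Level2", "3 Roof", "Basement"], STOREY_NAME
)
bad_codes = bad_format(["SL_45_10_49", "SL-45-10", "SL_45_1"], CLASS_CODE)
print(bad_storeys)
print(bad_codes)
```

`fullmatch` is used instead of `match` so that trailing characters outside the pattern also count as a format error.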
Range check
Numbers, measures, and dates often need to be within a certain range, that is, they have a minimum and/or maximum allowed value. For example, measure values cannot be negative, and the volume of an object cannot be larger than some reasonable value. In Simplebim, you can check that the values are in an allowed range.
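As a minimal sketch, a range check compares each value against an optional minimum and an optional maximum. The function `out_of_range`, the "Volume" property, and the limit of 500 are assumptions chosen for the example, not values or calls taken from Simplebim.

```python
def out_of_range(objects, prop, minimum=None, maximum=None):
    """Return ids of objects whose value lies outside [minimum, maximum]."""
    failed = []
    for obj in objects:
        value = obj.get(prop)
        if value is None:
            continue  # a missing value is a matter for the presence check
        too_small = minimum is not None and value < minimum
        too_large = maximum is not None and value > maximum
        if too_small or too_large:
            failed.append(obj["id"])
    return failed


# Hypothetical walls with volumes in cubic metres.
walls = [
    {"id": "w1", "Volume": 2.4},
    {"id": "w2", "Volume": -0.1},    # negative measure
    {"id": "w3", "Volume": 5000.0},  # unreasonably large
    {"id": "w4"},                    # no volume at all
]

suspicious = out_of_range(walls, "Volume", minimum=0.0, maximum=500.0)
print(suspicious)
```

Leaving `minimum` or `maximum` as `None` gives an open-ended range, which covers the "minimum and/or maximum" case.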
Uniqueness check
You can check whether the values of a property, or of a combination of properties, are unique. For example, the GUID (Globally Unique Identifier) must be unique for every object in the model.
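One way to picture a uniqueness check is to group objects by the value, or combination of values, that should be unique and then report every group with more than one member. The sketch below is plain Python with hypothetical data and a hypothetical helper `duplicates`; it does not use the Simplebim API.

```python
from collections import defaultdict


def duplicates(objects, props):
    """Return {value combination: [ids]} for combinations used more than once."""
    seen = defaultdict(list)
    for obj in objects:
        key = tuple(obj.get(prop) for prop in props)
        seen[key].append(obj["id"])
    return {key: ids for key, ids in seen.items() if len(ids) > 1}


# Hypothetical spaces: two of them share a GUID.
spaces = [
    {"id": 1, "GUID": "a1", "Storey": "01", "Number": "101"},
    {"id": 2, "GUID": "b2", "Storey": "01", "Number": "102"},
    {"id": 3, "GUID": "a1", "Storey": "02", "Number": "101"},
]

dup_guids = duplicates(spaces, ["GUID"])
dup_rooms = duplicates(spaces, ["Storey", "Number"])
print(dup_guids)
print(dup_rooms)
```

The room number "101" occurs twice, but the combination of storey and number is still unique, so only the GUID check reports a problem.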
Consistency checks
You can check that the data is logical. For example:
- The net area of an object cannot be larger than the bounding box area.
- If the material of an object is concrete, then it also needs to have frost resistance defined.
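The two examples above can be sketched as rules that look at several properties of the same object. The property names ("Net Area", "Bounding Box Area", "Material", "Frost Resistance") and the function `consistency_issues` are assumptions made for this illustration.

```python
def consistency_issues(obj):
    """Return a list of human-readable consistency problems for one object."""
    issues = []

    # Rule 1: the net area cannot exceed the bounding box area.
    net = obj.get("Net Area")
    bbox = obj.get("Bounding Box Area")
    if net is not None and bbox is not None and net > bbox:
        issues.append("net area exceeds bounding box area")

    # Rule 2: concrete objects must have frost resistance defined.
    if obj.get("Material") == "Concrete" and not obj.get("Frost Resistance"):
        issues.append("concrete without frost resistance")

    return issues


# Hypothetical objects.
slab = {"Net Area": 12.0, "Bounding Box Area": 10.0, "Material": "Concrete"}
wall = {
    "Net Area": 8.0,
    "Bounding Box Area": 10.0,
    "Material": "Concrete",
    "Frost Resistance": "F100",
}

slab_issues = consistency_issues(slab)
wall_issues = consistency_issues(wall)
print(slab_issues)
print(wall_issues)
```

Unlike the earlier checks, each rule here is conditional: the second rule only applies when the material is concrete.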
Data type checks
You can make sure that property values are of the required type. For example, the values of a property must be a string, boolean, numeric (integer or real), date, or a specific measure type such as length, area, or volume.
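In plain Python, a data type check can be approximated with `isinstance`. This is only a sketch with hypothetical data: measure types such as length or area have no built-in Python equivalent and are represented here simply as numbers, and `wrong_type` is an illustrative helper, not a Simplebim function.

```python
def wrong_type(objects, prop, expected):
    """Return ids of objects whose value is not of the expected type(s)."""
    expected_types = expected if isinstance(expected, tuple) else (expected,)
    failed = []
    for obj in objects:
        value = obj.get(prop)
        if value is None:
            continue  # a missing value is a matter for the presence check
        # bool is a subclass of int in Python, so a boolean would otherwise
        # pass as a number; reject it unless bool is explicitly expected.
        if isinstance(value, bool) and bool not in expected_types:
            failed.append(obj["id"])
        elif not isinstance(value, expected_types):
            failed.append(obj["id"])
    return failed


# Hypothetical doors.
doors = [
    {"id": "d1", "Fire Rated": True, "Width": 0.9},
    {"id": "d2", "Fire Rated": "yes", "Width": "900 mm"},
    {"id": "d3", "Fire Rated": False, "Width": True},
]

bad_booleans = wrong_type(doors, "Fire Rated", bool)
bad_numbers = wrong_type(doors, "Width", (int, float))
print(bad_booleans)
print(bad_numbers)
```

A value like "900 mm" is a typical finding: it looks like a measure to a person, but it is stored as text and cannot be used in calculations.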
Formatting and Using the Results
There are several ways to communicate and use validation results.
You can simply add a property to the objects and write the validation result to it using specific values. This way, anybody who uses the data will know whether problems were found and can take that into account in their analysis or reports.
You can also create groups of objects for each issue found. This can also be a good way to visualize the issues in Simplebim or downstream applications.
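Both approaches, writing the result to a property and grouping objects by issue, can be sketched together. The property name "Validation Status", the values "Passed" and "Failed", and the function `apply_results` are assumptions for this example and are not prescribed by Simplebim.

```python
from collections import defaultdict


def apply_results(objects, issues_by_id, prop="Validation Status"):
    """Write a status property to each object and group object ids by issue."""
    groups = defaultdict(list)
    for obj in objects:
        issues = issues_by_id.get(obj["id"], [])
        obj[prop] = "Failed" if issues else "Passed"
        for issue in issues:
            groups[issue].append(obj["id"])
    return dict(groups)


# Hypothetical objects and the issues found for them by earlier checks.
objects = [{"id": "s1"}, {"id": "s2"}, {"id": "w1"}]
issues_by_id = {
    "s2": ["empty space type"],
    "w1": ["negative volume", "duplicate GUID"],
}

issue_groups = apply_results(objects, issues_by_id)
print(objects)
print(issue_groups)
```

The status property travels with each object into reports, while the per-issue groups give one selectable set of objects for each kind of problem.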
Some issues can be tolerated, and some can be fixed automatically with data processing. However, some data issues must be communicated back to the model author. This is where, for example, the BCF standard can be useful.
Next Steps
If you followed the data processing documentation in order, you now have a good idea of how to explore, structure, clean, enrich, and validate data in an open, centralized, and automated way. The final step in a basic data processing workflow is publishing the data for further use. Read more about that here.