Taking a Retrieval-Augmented Generation (RAG) solution from demo to full-scale production is a long and…
How To Flatten Data
JSON data represents data as a collection of key-value pairs, where the keys are strings and the values can be strings, numbers, objects (another JSON object), arrays, boolean values, or null. One of the most common transformations in an ETL pipeline pulling JSON data is to flatten nested JSON objects into a flat format to load into databases and data warehouses. Nexla makes flattening nested JSON as easy as copying a few lines of code into a low-code Python transformation.
Why is JSON such a popular format for storing data?
JSON has a simple, human-readable syntax, making it easy for developers to work with. It is lightweight and has a small size compared to other data formats such as XML. Most APIs return data as JSON, making it a common data format in ETL pipelines.
What is nested JSON?
Nested JSON allows for more complex data structures to be represented in a single JSON document. One JSON object can contain multiple key-value pairs, where values can be of any data type, including another JSON object. Nested JSON is used to relate JSON objects to one another in, for example a customer JSON object may have an array of products purchased stored as nested objects.
For example:
{ "name": "John Doe", "address": { "street": "123 Main St", "city": "San Francisco", "state": "CA", "zip": "94105" }, "phone": "555-555-1212" }
In this example, the address key has a value that is another JSON object, containing its own keys and values.
What is flattening JSON?
Flattening nested JSON refers to the process of transforming a nested JSON data structure into a flat data structure, where all the nested objects and arrays are transformed into a single level of key-value pairs. Flat data structures are easier to process and analyze, as there are no nested objects to traverse. Flattening JSON is often a necessary step when loading data into many popular databases and data warehouses.
How to flatten data using Nexla
If you have a Nexset containing nested JSON that you want to flatten, the first step is to transform that Nexset, add a rule group, and select Transform: Code.
Nexla will display a code editor to write low-code transformations in Python or Javascript; in this example we’ll be using Python.
Copy the below code which calls the built-in Nexla function flatten on the input parameter, assigns the returned value to the output dictionary, then returns that as the transformed, flattened row to generate the resulting Nexset. You can replace the underscore input with any character you want to use as delimiter when flattening nested JSON objects.
def transform(input, metadata, args): # Sample python transform to pass all attributes through output = {} output = nexla_fn.call("flatten", "_", input) return output
After you are happy with the code, press Run Nexset Rules to see the output of your low-code transformation and your new flattened Nexset.
Now that you have flattened your input data, you can send it to any destination as a flat data structure.
Nexla makes one of the most common data transformations, a simple, intuitive process that even pseudo-technical users can take advantage of. Nexla can extract complicated JSON data structures, flatten them, and send them to their desired destination quickly, so you can save valuable engineering cycles to focus on more impactful work.
Unify your data operations today!
Discover how Nexla’s powerful data operations can put an end to your data challenges with our free demo.