
For all SDV demo datasets, make sure the metadata lists columns in the same order as the data #2821

@npatki

Problem Description

The metadata is meant to accurately reflect all the tables and columns in the dataset, including the order in which the columns appear within each table (from left to right). If the column order in the metadata does not match the column order in the data, this can cause inconsistencies down the line when we create synthetic data. See #2803 for an example.

Expected behavior

There are many demo datasets available for the multi-table, single-table, and sequential modalities. For each of these datasets, the metadata should list each table's columns in the same order that they appear in the data itself (from left to right).

If there is a mismatch, I expect the metadata to be updated (in our S3 buckets) so that it matches the column order of the data tables (from left to right).
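As a rough illustration of the check and fix described above, here is a minimal sketch in plain Python. It assumes the metadata is a dict shaped like `{"tables": {table_name: {"columns": {column_name: {...}}}}}` and that the data's column order is available as a list per table (for example, `list(df.columns)` for a pandas DataFrame). The function names `find_column_order_mismatches` and `reorder_metadata_columns` and the example dataset are hypothetical and not part of SDV.

```python
import copy


def find_column_order_mismatches(metadata, data_columns):
    """Return {table_name: {"metadata": [...], "data": [...]}} for every
    table whose metadata column order differs from the data column order.

    `metadata` is assumed to look like
    {"tables": {name: {"columns": {col: {...}}}}} (an assumption).
    `data_columns` maps each table name to its columns, left to right.
    """
    mismatches = {}
    for table_name, table_meta in metadata.get("tables", {}).items():
        metadata_order = list(table_meta.get("columns", {}))
        data_order = list(data_columns.get(table_name, []))
        if metadata_order != data_order:
            mismatches[table_name] = {
                "metadata": metadata_order,
                "data": data_order,
            }
    return mismatches


def reorder_metadata_columns(metadata, data_columns):
    """Return a copy of `metadata` whose columns follow the data's order.

    Columns that appear only in the metadata are kept at the end instead
    of being dropped, so no column definition is lost.
    """
    fixed = copy.deepcopy(metadata)
    for table_name, table_meta in fixed.get("tables", {}).items():
        columns = table_meta.get("columns", {})
        data_order = data_columns.get(table_name, [])
        ordered = {name: columns[name] for name in data_order if name in columns}
        for name, spec in columns.items():
            ordered.setdefault(name, spec)
        table_meta["columns"] = ordered
    return fixed


# Hypothetical example: the metadata lists "age" before "user_id",
# while the data has "user_id" first.
example_metadata = {
    "tables": {
        "users": {
            "columns": {
                "age": {"sdtype": "numerical"},
                "user_id": {"sdtype": "id"},
            }
        }
    }
}
example_columns = {"users": ["user_id", "age"]}

mismatches = find_column_order_mismatches(example_metadata, example_columns)
fixed_metadata = reorder_metadata_columns(example_metadata, example_columns)
```

The reordering relies on dictionaries preserving insertion order, which Python guarantees from version 3.7 onward. The original metadata is left untouched, so the corrected copy can be reviewed before it is saved.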

Additional context

Before Python 3.7, the language did not guarantee that dictionaries (such as metadata) preserve insertion order, so they were treated as unordered. Since we created and saved the metadata dictionaries several years ago, this is probably why the columns are listed out of order in the metadata dict.
