This repository was archived by the owner on Apr 28, 2025. It is now read-only.

Commit 7b783e1

Make sure to sort dataframes
Since database queries don't necessarily return rows in the same order every time, the results need to be sorted before they can be compared.
1 parent 7304a88 commit 7b783e1

1 file changed

Lines changed: 25 additions & 5 deletions

02_Intermediate_TAP_Query.ipynb

@@ -87,6 +87,26 @@
 "warnings.filterwarnings('ignore')"
 ]
 },
+{
+"cell_type": "markdown",
+"metadata": {},
+"source": [
+"In general the order of results from database queries cannot be assumed to be the same every time.\n",
+"This function sorts the data so we can compare the result dataframes even if the records are not in the same order from the query."
+]
+},
+{
+"cell_type": "code",
+"execution_count": null,
+"metadata": {},
+"outputs": [],
+"source": [
+"def sort_dataframe(df, sort_key='objectid'):\n",
+"    df = df.sort_values('objectId')\n",
+"    df.set_index(np.array(range(len(df))), inplace=True) # Since we are sorting, we need to reset the incremental index as well\n",
+"    return df"
+]
+},
 {
 "cell_type": "markdown",
 "metadata": {},
@@ -303,8 +323,8 @@
 "source": [
 "# Convert the results to pandas data frames and assert that the\n",
 "# contents of the two tables are identical\n",
-"assert_frame_equal(results.to_table().to_pandas(),\n",
-"                   results1.to_table().to_pandas())"
+"assert_frame_equal(sort_dataframe(results.to_table().to_pandas()),\n",
+"                   sort_dataframe(results1.to_table().to_pandas()))"
 ]
 },
 {
@@ -827,7 +847,7 @@
 "# Assert that the results are the same as obtained from\n",
 "# executing synchronous queries\n",
 "assert len(async_results) == 14424 \n",
-"assert_frame_equal(results, async_results.to_table().to_pandas())"
+"assert_frame_equal(sort_dataframe(results), sort_dataframe(async_results.to_table().to_pandas()))"
 ]
 },
 {
@@ -848,8 +868,8 @@
 "retrieved_job = retrieve_query(job.url)\n",
 "retrieved_results = retrieved_job.fetch_result()\n",
 "assert len(retrieved_results) == 14424\n",
-"assert_frame_equal(retrieved_results.to_table().to_pandas(),\n",
-"                   async_results.to_table().to_pandas())"
+"assert_frame_equal(sort_dataframe(retrieved_results.to_table().to_pandas()),\n",
+"                   sort_dataframe(async_results.to_table().to_pandas()))"
 ]
 },
 {
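
Note on the helper added in this commit: `sort_dataframe` accepts a `sort_key` argument (default `'objectid'`) but sorts on the hard-coded column `'objectId'`, so the argument has no effect. It also relies on `np` being imported elsewhere in the notebook. The following is a minimal sketch of a possible follow-up, assuming the intent was to sort on the column named by `sort_key`. The `reset_index(drop=True)` call, the stable `mergesort` kind, and the sample values are illustrative assumptions, not part of the commit.

```python
import pandas as pd
from pandas.testing import assert_frame_equal


def sort_dataframe(df, sort_key='objectId'):
    """Return a copy of df sorted by sort_key with a fresh 0..n-1 index."""
    # kind='mergesort' is a stable sort, so rows with equal keys
    # keep their relative order.
    sorted_df = df.sort_values(sort_key, kind='mergesort')
    # Drop the old index so two frames holding the same rows in a
    # different original order compare equal.
    return sorted_df.reset_index(drop=True)


# Two result sets with the same rows in different order (made-up values).
first = pd.DataFrame({'objectId': [3, 1, 2], 'ra': [10.0, 20.0, 30.0]})
second = pd.DataFrame({'objectId': [2, 3, 1], 'ra': [30.0, 10.0, 20.0]})

# Passes once both frames are sorted; raises AssertionError on the raw frames.
assert_frame_equal(sort_dataframe(first), sort_dataframe(second))
```

Sorting on a single key is only deterministic if that key is unique per row; if the query can return duplicate `objectId` values, passing a list of columns to `sort_values` would be a safer choice.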
