This repository was archived by the owner on Apr 28, 2025. It is now read-only.

Commit 1e063be

MLG updates to JB input
1 parent 43e883e commit 1e063be

1 file changed

Lines changed: 50 additions & 19 deletions

04_Intro_to_Butler.ipynb

Lines changed: 50 additions & 19 deletions
Original file line number | Diff line number | Diff line change
@@ -392,9 +392,13 @@
392392
"\n",
393393
"The following examples show how to query for data sets that include a desired coordinate and observation date.\n",
394394
"\n",
395-
"Above, we can see that for visit 971990, the (RA,Dec) are (70.37770,-37.1757) and the observation date is 20251201\n",
395+
"##### Temporal queries\n",
396+
"\n",
397+
"Above, we can see that for visit 971990, the (RA,Dec) are (70.37770,-37.1757) and the observation date is 20251201.\n",
396398
"But these are just human-readable summaries of the more precise temporal and spatial information stored in the registry, which are represented in Python by `Timespan` and `Region` objects, respectively.\n",
397-
"`DimensionRecord` objects that represent spatial or temporal concepts (a `visit` is both) have these objects attached to them:"
399+
"`DimensionRecord` objects that represent spatial or temporal concepts (a `visit` is both) have these objects attached to them.\n",
400+
"\n",
401+
"Retrieve the `DimensionRecord` for a visit and show its timespan and region."
398402
]
399403
},
400404
{
@@ -404,15 +408,17 @@
404408
"outputs": [],
405409
"source": [
406410
"(record,) = registry.queryDimensionRecords('visit', visit=971990)\n",
411+
"\n",
407412
"print(record.timespan)\n",
413+
"print(' ')\n",
408414
"print(record.region)"
409415
]
410416
},
411417
{
412418
"cell_type": "markdown",
413419
"metadata": {},
414420
"source": [
415-
"If the timespan or spatial region that's being used as a query constraint is already associated with a data ID in the database, spatial and temporal overlap constraints are automatic.\n",
421+
"If the timespan or spatial region being used as a query constraint is already associated with a data ID in the database, the spatial and temporal overlap constraints are automatic.\n",
416422
"For example, if we query for `deepCoadd` datasets with a `visit`+`detector` data ID, we'll get just the ones that overlap that observation and have the same band (because a visit implies a band):"
417423
]
418424
},
@@ -430,7 +436,8 @@
430436
"cell_type": "markdown",
431437
"metadata": {},
432438
"source": [
433-
"To query for dimension records or datasets that overlap an arbitrary time range, we can use the `bind` argument to pass times through to `where`; we'll use this to look for visits within one minute of this one on either side:"
439+
"To query for dimension records or datasets that overlap an arbitrary time range, we can use the `bind` argument to pass times through to `where`.\n",
440+
"Use this, along with [astropy.time](https://docs.astropy.org/en/stable/time/index.html), to look for visits within one minute of this one on either side."
434441
]
435442
},
436443
{
@@ -454,17 +461,20 @@
454461
"Using `bind` to define an alias for a variable saves us from having to string-format the times into the `where` expression.\n",
455462
"Unfortunately, there is a bug in `queryDatasets` that prevents `bind` from working there (fixed in `w_2021_25`).\n",
456463
"\n",
457-
"A `Timespan` can have a `begin` or `end` of `None` if it is unbounded on that side."
464+
"Note that a `dafButler.Timespan` will accept a `begin` or `end` value that is equal to `None` if it is unbounded on that side."
458465
]
459466
},
460467
{
461468
"cell_type": "markdown",
462469
"metadata": {},
463470
"source": [
464-
"Arbitrary spatial queries are not supported, but we do have set of dimensions that correspond to different levels of the HTM (hierarchical triangular mesh) pixelization of the sky.\n",
471+
"##### Spatial queries\n",
465472
"\n",
466-
"So one can transform a region or point into one or more HTM IDs, and then seach using that as a spatial data ID.\n",
467-
"The `lsst.sphgeom` library is what backs our region objects, and we can also use it to find the HTM ID for a point."
473+
"Arbitrary spatial queries are not supported, but we do have set of dimensions that correspond to different levels of the HTM (hierarchical triangular mesh) pixelization of the sky ([HTM primer](http://www.skyserver.org/htm/)).\n",
474+
"The process is to transform a region or point into one or more HTM identifiers (HTM IDs), and then create a query using the HTM ID as the spatial data ID.\n",
475+
"The `lsst.sphgeom` library supports region objects and HTM pixelization in the LSST Science Pipelines.\n",
476+
"\n",
477+
"Import the `lsst.sphgeom` package, initialize a sky pixelization to level 10 (the level at which one sky pixel is about five arcmin across), and find the HTM ID for a desired sky coordinate."
468478
]
469479
},
470480
{
@@ -475,7 +485,7 @@
475485
"source": [
476486
"import lsst.sphgeom\n",
477487
"\n",
478-
"pixelization = lsst.sphgeom.HtmPixelization(7)"
488+
"pixelization = lsst.sphgeom.HtmPixelization(10)"
479489
]
480490
},
481491
{
@@ -486,18 +496,43 @@
486496
"source": [
487497
"htm_id = pixelization.index(\n",
488498
" lsst.sphgeom.UnitVector3d(\n",
489-
" lsst.sphgeom.LonLat.fromDegrees(70.37699524983329, -37.17573628348882)\n",
499+
" lsst.sphgeom.LonLat.fromDegrees(70.376995, -37.175736)\n",
490500
" )\n",
491501
")\n",
502+
"\n",
503+
"# Obtain and print the scale to provide a sense of the size of the sky pixelization being used\n",
492504
"scale = pixelization.triangle(htm_id).getBoundingCircle().getOpeningAngle().asDegrees()*3600\n",
493505
"print(f'HTM ID={htm_id} at level={pixelization.getLevel()} is a ~{scale:0.2}\" triangle.')"
494506
]
495507
},
508+
{
509+
"cell_type": "code",
510+
"execution_count": null,
511+
"metadata": {},
512+
"outputs": [],
513+
"source": [
514+
"visits = registry.queryDimensionRecords(\"visit\", htm20=htm_id,\n",
515+
" where=\"visit.timespan OVERLAPS my_timespan\",\n",
516+
" bind={\"my_timespan\": timespan})\n",
517+
"exposures = registry.queryDimensionRecords(\"exposure\", htm20=htm_id,\n",
518+
" where=\"visit.timespan OVERLAPS my_timespan\",\n",
519+
" bind={\"my_timespan\": timespan})\n",
520+
"detectors = registry.queryDimensionRecords(\"detector\", htm20=htm_id,\n",
521+
" where=\"visit.timespan OVERLAPS my_timespan\",\n",
522+
" bind={\"my_timespan\": timespan})\n",
523+
"\n",
524+
"for visit, exposure, detector in zip(visits, exposures, detectors):\n",
525+
" print(visit.id, visit.timespan, visit.physical_filter, exposure.tracking_ra, exposure.tracking_dec, detector.id)"
526+
]
527+
},
496528
{
497529
"cell_type": "markdown",
498530
"metadata": {},
499531
"source": [
500-
"And we can use that to query for (e.g.) the set of all `src` data products that overlap this point in i:"
532+
"Thus, with the above query, we have uniquely identified the visit and detector for our desired temporal and spatial constraints.\n",
533+
"Note that if a lower HTM level is used (such as 7), which corresponds to a larger sky pixel (~2200 arcseconds), the above query will return many more visits and detectors that overlap that larger region. Try it and see!\n",
534+
"\n",
535+
"Note that queries using the HTM ID can also be used to, e.g., find the set of all `src` catalog data products that overlap this point in i."
501536
]
502537
},
503538
{
@@ -516,15 +551,11 @@
516551
"cell_type": "markdown",
517552
"metadata": {},
518553
"source": [
554+
"Why does that search take tens of seconds?\n",
519555
"The butler's spatial reasoning is designed to work well for regions the size of full data products, like detector- or patch-level images and catalogs, and it's a poor choice for object-scale searches.\n",
520-
"The query above is slow in large part because it actually searches for all `src` datasets that overlap the much larger htm7 pixel (about a degree on a side), and then filters the results down to the htm20 pixel in Python."
521-
]
522-
},
523-
{
524-
"cell_type": "markdown",
525-
"metadata": {},
526-
"source": [
527-
"That said, it's something we'll use frequently below, so it will be useful to wrap this in a function:"
556+
"The above search is slow in part because `queryDatasets` searches for all `src` datasets that overlap a larger region and then filters the results down to the specified HTM ID pixel.\n",
557+
"\n",
558+
"Options for exploring and retrieving catalog data with the Butler are covered in more depth in Section 5."
528559
]
529560
},
530561
{
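
The notebook text in this diff quotes two sky-pixel sizes: about five arcminutes at HTM level 10 and about 2200 arcseconds at level 7. A rough way to see where those numbers come from is that HTM starts from eight spherical triangles with 90-degree sides and splits each triangle into four at every level, which roughly halves the side length each time. The sketch below is only that back-of-the-envelope estimate in plain Python. The function names `approx_htm_side_arcsec` and `htm_trixel_count` are hypothetical, and this is not the bounding-circle calculation that `lsst.sphgeom` performs in the notebook, so the values agree only approximately.

```python
# Back-of-the-envelope HTM pixel size. This is an assumption-laden estimate,
# not the lsst.sphgeom calculation used in the notebook.

LEVEL0_SIDE_DEG = 90.0  # side of one of the eight level-0 spherical triangles


def approx_htm_side_arcsec(level: int) -> float:
    """Return the approximate triangle side length in arcseconds at `level`."""
    if level < 0:
        raise ValueError("HTM level must be non-negative")
    # Each subdivision roughly halves the side length.
    return LEVEL0_SIDE_DEG * 3600.0 / 2**level


def htm_trixel_count(level: int) -> int:
    """Return the number of triangles covering the whole sky at `level`."""
    # Eight base triangles, each split into four per level.
    return 8 * 4**level


for level in (7, 10, 20):
    side = approx_htm_side_arcsec(level)
    count = htm_trixel_count(level)
    print(f"level {level}: ~{side:.1f} arcsec per side, {count} triangles")
```

Under this estimate, level 10 comes out near 316 arcseconds (a little over five arcminutes) and level 7 near 2500 arcseconds, the same order as the ~2200 arcseconds quoted in the notebook. Each step down in level doubles the pixel side, which is why a level-7 query matches many more visits and detectors than a level-10 one.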
