|
392 | 392 | "\n", |
393 | 393 | "The following examples show how to query for data sets that include a desired coordinate and observation date.\n", |
394 | 394 | "\n", |
395 | | - "Above, we can see that for visit 971990, the (RA,Dec) are (70.37770,-37.1757) and the observation date is 20251201\n", |
| 395 | + "##### Temporal queries\n", |
| 396 | + "\n", |
| 397 | + "Above, we can see that for visit 971990, the (RA,Dec) are (70.37770,-37.1757) and the observation date is 20251201.\n", |
396 | 398 | "But these are just human-readable summaries of the more precise temporal and spatial information stored in the registry, which are represented in Python by `Timespan` and `Region` objects, respectively.\n", |
397 | | - "`DimensionRecord` objects that represent spatial or temporal concepts (a `visit` is both) have these objects attached to them:" |
| 399 | + "`DimensionRecord` objects that represent spatial or temporal concepts (a `visit` is both) have these objects attached to them.\n", |
| 400 | + "\n", |
| 401 | + "Retrieve the `DimensionRecord` for a visit and show its timespan and region." |
398 | 402 | ] |
399 | 403 | }, |
400 | 404 | { |
|
404 | 408 | "outputs": [], |
405 | 409 | "source": [ |
406 | 410 | "(record,) = registry.queryDimensionRecords('visit', visit=971990)\n", |
| 411 | + "\n", |
407 | 412 | "print(record.timespan)\n", |
| 413 | + "print(' ')\n", |
408 | 414 | "print(record.region)" |
409 | 415 | ] |
410 | 416 | }, |
411 | 417 | { |
412 | 418 | "cell_type": "markdown", |
413 | 419 | "metadata": {}, |
414 | 420 | "source": [ |
415 | | - "If the timespan or spatial region that's being used as a query constraint is already associated with a data ID in the database, spatial and temporal overlap constraints are automatic.\n", |
| 421 | + "If the timespan or spatial region used as a query constraint is already associated with a data ID in the database, the spatial and temporal overlap constraints are applied automatically.\n", |
416 | 422 | "For example, if we query for `deepCoadd` datasets with a `visit`+`detector` data ID, we'll get just the ones that overlap that observation and have the same band (because a visit implies a band):" |
417 | 423 | ] |
418 | 424 | }, |
|
430 | 436 | "cell_type": "markdown", |
431 | 437 | "metadata": {}, |
432 | 438 | "source": [ |
433 | | - "To query for dimension records or datasets that overlap an arbitrary time range, we can use the `bind` argument to pass times through to `where`; we'll use this to look for visits within one minute of this one on either side:" |
| 439 | + "To query for dimension records or datasets that overlap an arbitrary time range, we can use the `bind` argument to pass times through to `where`.\n", |
| 440 | + "Use this, along with [astropy.time](https://docs.astropy.org/en/stable/time/index.html), to look for visits within one minute of this one on either side." |
434 | 441 | ] |
435 | 442 | }, |
436 | 443 | { |
|
454 | 461 | "Using `bind` to define an alias for a variable saves us from having to string-format the times into the `where` expression.\n", |
455 | 462 | "Unfortunately, there is a bug in `queryDatasets` that prevents `bind` from working there (fixed in `w_2021_25`).\n", |
456 | 463 | "\n", |
457 | | - "A `Timespan` can have a `begin` or `end` of `None` if it is unbounded on that side." |
| 464 | + "Note that a `dafButler.Timespan` will accept a `begin` or `end` value that is equal to `None` if it is unbounded on that side." |
458 | 465 | ] |
459 | 466 | }, |
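The overlap test and the meaning of an unbounded `None` endpoint can be illustrated without the Butler. The following is a minimal sketch in plain Python, assuming half-open `[begin, end)` intervals; the `overlaps` and `padded` helpers and all of the clock times are hypothetical stand-ins for illustration, not the `dafButler.Timespan` API.

```python
from datetime import datetime, timedelta
from typing import Optional, Tuple

# A stand-in for a timespan: (begin, end), where None means "unbounded on that side".
Interval = Tuple[Optional[datetime], Optional[datetime]]


def overlaps(a: Interval, b: Interval) -> bool:
    """Return True if two half-open [begin, end) intervals share any instant."""
    a_begin, a_end = a
    b_begin, b_end = b
    # An unbounded side can never rule out an overlap.
    a_starts_before_b_ends = a_begin is None or b_end is None or a_begin < b_end
    b_starts_before_a_ends = b_begin is None or a_end is None or b_begin < a_end
    return a_starts_before_b_ends and b_starts_before_a_ends


def padded(interval: Interval, pad: timedelta) -> Interval:
    """Widen an interval by `pad` on each bounded side."""
    begin, end = interval
    return (
        None if begin is None else begin - pad,
        None if end is None else end + pad,
    )


# Hypothetical 30-second visit on the observation date used above.
visit = (datetime(2025, 12, 1, 3, 0, 0), datetime(2025, 12, 1, 3, 0, 30))
window = padded(visit, timedelta(minutes=1))

neighbor = (datetime(2025, 12, 1, 3, 1, 10), datetime(2025, 12, 1, 3, 1, 40))
later = (datetime(2025, 12, 1, 3, 5, 0), datetime(2025, 12, 1, 3, 5, 30))
open_ended = (datetime(2025, 12, 1, 3, 4, 0), None)

print(overlaps(window, neighbor))   # starts inside the one-minute padding
print(overlaps(window, later))      # starts well after the padded window
print(overlaps(open_ended, later))  # unbounded end reaches the later visit
```

In the registry query itself, the same comparison is expressed by the `OVERLAPS` operator in the `where` string, with the padded timespan supplied through `bind`.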
460 | 467 | { |
461 | 468 | "cell_type": "markdown", |
462 | 469 | "metadata": {}, |
463 | 470 | "source": [ |
464 | | - "Arbitrary spatial queries are not supported, but we do have set of dimensions that correspond to different levels of the HTM (hierarchical triangular mesh) pixelization of the sky.\n", |
| 471 | + "##### Spatial queries\n", |
465 | 472 | "\n", |
466 | | - "So one can transform a region or point into one or more HTM IDs, and then seach using that as a spatial data ID.\n", |
467 | | - "The `lsst.sphgeom` library is what backs our region objects, and we can also use it to find the HTM ID for a point." |
| 473 | + "Arbitrary spatial queries are not supported, but we do have a set of dimensions that correspond to different levels of the HTM (hierarchical triangular mesh) pixelization of the sky ([HTM primer](http://www.skyserver.org/htm/)).\n", |
| 474 | + "The process is to transform a region or point into one or more HTM identifiers (HTM IDs), and then create a query using the HTM ID as the spatial data ID.\n", |
| 475 | + "The `lsst.sphgeom` library supports region objects and HTM pixelization in the LSST Science Pipelines.\n", |
| 476 | + "\n", |
| 477 | + "Import the `lsst.sphgeom` package, initialize a sky pixelization to level 10 (the level at which one sky pixel is about five arcmin across), and find the HTM ID for a desired sky coordinate." |
468 | 478 | ] |
469 | 479 | }, |
470 | 480 | { |
|
475 | 485 | "source": [ |
476 | 486 | "import lsst.sphgeom\n", |
477 | 487 | "\n", |
478 | | - "pixelization = lsst.sphgeom.HtmPixelization(7)" |
| 488 | + "pixelization = lsst.sphgeom.HtmPixelization(10)" |
479 | 489 | ] |
480 | 490 | }, |
481 | 491 | { |
|
486 | 496 | "source": [ |
487 | 497 | "htm_id = pixelization.index(\n", |
488 | 498 | " lsst.sphgeom.UnitVector3d(\n", |
489 | | - " lsst.sphgeom.LonLat.fromDegrees(70.37699524983329, -37.17573628348882)\n", |
| 499 | + " lsst.sphgeom.LonLat.fromDegrees(70.376995, -37.175736)\n", |
490 | 500 | " )\n", |
491 | 501 | ")\n", |
| 502 | + "\n", |
| 503 | + "# Obtain and print the scale to provide a sense of the size of the sky pixelization being used\n", |
492 | 504 | "scale = pixelization.triangle(htm_id).getBoundingCircle().getOpeningAngle().asDegrees()*3600\n", |
493 | 505 | "print(f'HTM ID={htm_id} at level={pixelization.getLevel()} is a ~{scale:0.2}\" triangle.')" |
494 | 506 | ] |
495 | 507 | }, |
| 508 | + { |
| 509 | + "cell_type": "code", |
| 510 | + "execution_count": null, |
| 511 | + "metadata": {}, |
| 512 | + "outputs": [], |
| 513 | + "source": [ |
| 514 | + "visits = registry.queryDimensionRecords(\"visit\", htm10=htm_id,\n", |
| 515 | + " where=\"visit.timespan OVERLAPS my_timespan\",\n", |
| 516 | + " bind={\"my_timespan\": timespan})\n", |
| 517 | + "exposures = registry.queryDimensionRecords(\"exposure\", htm10=htm_id,\n", |
| 518 | + " where=\"visit.timespan OVERLAPS my_timespan\",\n", |
| 519 | + " bind={\"my_timespan\": timespan})\n", |
| 520 | + "detectors = registry.queryDimensionRecords(\"detector\", htm10=htm_id,\n", |
| 521 | + " where=\"visit.timespan OVERLAPS my_timespan\",\n", |
| 522 | + " bind={\"my_timespan\": timespan})\n", |
| 523 | + "\n", |
| 524 | + "for visit, exposure, detector in zip(visits, exposures, detectors):\n", |
| 525 | + " print(visit.id, visit.timespan, visit.physical_filter, exposure.tracking_ra, exposure.tracking_dec, detector.id)" |
| 526 | + ] |
| 527 | + }, |
496 | 528 | { |
497 | 529 | "cell_type": "markdown", |
498 | 530 | "metadata": {}, |
499 | 531 | "source": [ |
500 | | - "And we can use that to query for (e.g.) the set of all `src` data products that overlap this point in i:" |
| 532 | + "Thus, with the above query, we have uniquely identified the visit and detector for our desired temporal and spatial constraints.\n", |
| 533 | + "Note that if a lower HTM level is used (like 7), which corresponds to a larger sky pixel (~2200 arcseconds), the above query will return many more visits and detectors that overlap that larger region. Try it and see!\n", |
| 534 | + "\n", |
| 535 | + "Queries using the HTM ID can also be used to find, e.g., the set of all `src` catalog data products that overlap this point in the i band." |
501 | 536 | ] |
502 | 537 | }, |
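The relationship between HTM level, pixel size, and the `htm<level>` data ID key can be sketched with arithmetic alone. This is a rough illustration, assuming the standard HTM scheme of 8 root triangles that each split into 4 children per level, so that a level-N ID has `4 + 2*N` bits. The helper names are hypothetical, and the size estimate (square root of the mean pixel area) is a different measure from the bounding-circle opening angle printed above, so the two numbers agree only approximately.

```python
import math

# Full sky area in square degrees: 4*pi steradians converted to deg^2.
FULL_SKY_SQ_DEG = 4.0 * math.pi * (180.0 / math.pi) ** 2


def htm_level(htm_id: int) -> int:
    """Infer the HTM level from an ID, assuming level-N IDs span [8*4**N, 16*4**N)."""
    bits = htm_id.bit_length()
    if htm_id < 8 or bits % 2 != 0:
        raise ValueError(f"{htm_id} is not a valid HTM ID")
    return (bits - 4) // 2


def approx_pixel_scale_arcsec(level: int) -> float:
    """Estimate the linear size of a level-`level` HTM pixel from its mean area."""
    n_pixels = 8 * 4**level
    return math.sqrt(FULL_SKY_SQ_DEG / n_pixels) * 3600.0


def skypix_key(htm_id: int) -> str:
    """Hypothetical helper: build the keyword (e.g. 'htm7') matching an ID's level."""
    return f"htm{htm_level(htm_id)}"


for level in (7, 10, 20):
    scale = approx_pixel_scale_arcsec(level)
    print(f"level {level:2d}: ~{scale:.3g} arcsec across")
```

A helper like `skypix_key` is one way to keep the keyword passed to a registry query consistent with the level of the pixelization that produced the ID.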
503 | 538 | { |
|
516 | 551 | "cell_type": "markdown", |
517 | 552 | "metadata": {}, |
518 | 553 | "source": [ |
| 554 | + "Why does that search take tens of seconds?\n", |
519 | 555 | "The butler's spatial reasoning is designed to work well for regions the size of full data products, like detector- or patch-level images and catalogs, and it's a poor choice for object-scale searches.\n", |
520 | | - "The query above is slow in large part because it actually searches for all `src` datasets that overlap the much larger htm7 pixel (about a degree on a side), and then filters the results down to the htm20 pixel in Python." |
521 | | - ] |
522 | | - }, |
523 | | - { |
524 | | - "cell_type": "markdown", |
525 | | - "metadata": {}, |
526 | | - "source": [ |
527 | | - "That said, it's something we'll use frequently below, so it will be useful to wrap this in a function:" |
| 556 | + "The above search is slow in part because `queryDatasets` searches for all `src` datasets that overlap a larger region and then filters the results down to the specified HTM ID pixel.\n", |
| 557 | + "\n", |
| 558 | + "Options for exploring and retrieving catalog data with the Butler are covered in more depth in Section 5." |
528 | 559 | ] |
529 | 560 | }, |
530 | 561 | { |
|