Skip to content

Fix problems with tools#145

Open
mkavulich wants to merge 16 commits intoESCOMP:mainfrom
mkavulich:feature/fix_tools
Open

Fix problems with tools#145
mkavulich wants to merge 16 commits intoESCOMP:mainfrom
mkavulich:feature/fix_tools

Conversation

@mkavulich
Copy link
Copy Markdown
Collaborator

@mkavulich mkavulich commented Apr 13, 2026

Description

In the process of trying to add a new tool to check for redundant CF name elements (see #143), I found that our existing tools suffer from a number of issues:

This PR resolves those problems. Also, I took the opportunity to clean up the existing tools:

  • Make all python scripts pass pylint with a 10/10 score
  • Standardized the format of comments and multi-line strings
  • Standardized how command-line arguments are named and referenced
  • Removed many unnecessary comments

And finally, I added the check for duplicate cfname elements that was the start of this whole process.

Issues

Resolves #118, #144

Copy link
Copy Markdown
Collaborator Author

@mkavulich mkavulich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Including some comments to make reviews easier

Comment thread tools/lib/xml_tools.py
return result

###############################################################################
def find_schema_file(schema_root, schema_path=None):
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: this tool is no longer needed since we no longer have the schema version in the filename (#142)

Comment thread tools/lib/xml_tools.py
###############################################################################
"""Read the XML file, <filename>, and return its tree and root"""
if os.path.isfile(filename) and os.access(filename, os.R_OK):
file_open = (lambda x: open(x, 'r'))
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This logic was simplified as a suggestion from pylint

#Function to extract standard names from element tree root
#+++++++++++++++++++++++++++++++++++++++++++++++++++++++++

def get_dict_stdnames(xml_tree_root):
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Renamed this function to get_standard_names_as_set to avoid confusion (it returns a set, not a dict)


#+++++++++++++++++++++++++++++++++++++++++++++++++++++++++

############
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This logic isn't really changed all that much, just moved into function main_func(); choosing the "Hide whitespace" option will make this bit easier to review.

Comment thread tools/lib/__init__.py
@@ -0,0 +1,8 @@
"""Shared utilities for ESM Standard Names tools."""
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This allows us to import these functions directly from tools in the directory above; note the simplified logic in imports at the top of those scripts.

Comment thread tools/check_xml_unique.py
#get list of all standard names
all_std_names = []
for name in root.findall('./section/standard_name'):
for name in root.findall('.//standard_name'):
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The change that started this whole rabbit hole: we need to check for ALL standard_name entries, not just the top-level ones

Comment thread tools/check_xml_unique.py
else:
rm_elements = root.findall(f'.//standard_name[@{args.field}="{dup}"]')[1:]
print(f"{dup}, ({len(rm_elements)} duplicate(s))")
if args.overwrite:
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I made a judgement call here that we always want to manually resolve duplicate entries. Doing it automatically seems prone to error (e.g. they were intended to be separate entries but had the same standard name due to a typo).

def standard_name_to_description(prop_dict, context=None):
def standard_name_to_description(prop_dict):
########################################################################
"""Translate a standard_name to its default description
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These checks were remnants from code originally copied from the framework, and have no function here

match = _REAL_SUBST_RE.match(description)
# end while
else:
description = ''
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

More logic leftover from ccpp-framework

file_ok = validate_xml_file(stdname_file, schema_name,
None, schema_path=schema_root,
error_on_noxmllint=True)
except ValueError as valerr:
Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is convoluted logic to catch an error raised by the validate_xml_file function....it's way simpler and better practice to just let the exception be raised and reference the resulting error trace.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Update python scripts to use f-strings

1 participant