The ecco_access “package”: a starting point for accessing ECCO output on PO.DAAC#

Andrew Delman, updated 2024-10-13

Introduction
Add ecco_access to your Python path
Using the ecco_podaac_to_xrdataset function
Using the ecco_podaac_access function

Introduction#

In the several years since ECCOv4 release 4 output became available on the Physical Oceanography Distributed Active Archive Center (PO.DAAC), Jack McNelis, Ian Fenty, and Andrew Delman have written a number of Python scripts and functions to facilitate requests for this output. To make access easier and to standardize the format of these requests, the ecco_access library is now available in the ecco_access folder of the ECCO-v4-Python-Tutorial GitHub repository.

I call this library a “package” in quotes because it has the core structure of a package you would install with conda or pip: an __init__.py file gives you access to all of the library’s modules and their functions through a single import ecco_access command. However, the “package” is not yet available through conda or pip. For the convenience of the ECCO Hackathon, ecco_access has been copied to the ecco-2024 repo used in the hackathon, with symbolic links in the existing tutorial directories so that those tutorials can use the library immediately.

This tutorial will help you get set up with ecco_access so that you can use it from your tutorial directories on OSS or wherever else you need it. It also introduces the two top-level functions that you would likely use in ecco_access:

  • ecco_podaac_to_xrdataset: takes as input a text query or ECCO dataset identifier, and returns an xarray Dataset

  • ecco_podaac_access: takes the same input, but returns the URLs/paths or local files where the data is located

Add ecco_access to your Python path#

For more extensive use of this “package” (and sharing any edits with the community), I recommend cloning the ECCO-v4-Python-Tutorial repo and then adding the ecco_access folder to your Python path, using the steps below.

Clone the ECCO-v4-Python-Tutorial repository#

Navigate to your home directory before cloning the ECCO-v4-Python-Tutorial repo. This way the repo will appear as a directory under your home directory and is easily accessed.

cd ~
git clone git@github.com:ECCO-GROUP/ECCO-v4-Python-Tutorial.git
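
Once the repo is cloned, one way to make ecco_access importable in a Python session is to add the directory that contains the ecco_access folder to sys.path. The snippet below is a minimal sketch, not an official setup procedure. It assumes the repo was cloned into your home directory as above and that the ecco_access folder sits at the top level of the repo; the helper name add_to_python_path is made up for illustration.

```python
import sys
from os.path import expanduser, join


def add_to_python_path(directory):
    """Put directory at the front of sys.path unless it is already listed."""
    if directory not in sys.path:
        sys.path.insert(0, directory)
    return directory


# Assumption: the repo lives at ~/ECCO-v4-Python-Tutorial and contains ecco_access/
repo_dir = join(expanduser('~'), 'ECCO-v4-Python-Tutorial')
add_to_python_path(repo_dir)

# If the assumption holds, this import should now succeed:
# import ecco_access as ea
```

A change to sys.path lasts only for the current session. For a permanent setting, the same directory can be added to the PYTHONPATH environment variable in your shell startup file.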

Using the ecco_podaac_to_xrdataset function#

Perhaps the most convenient way to use ecco_access is the ecco_podaac_to_xrdataset function. It takes as input a query consisting of NASA Earthdata dataset ShortName(s), ECCO variable names, or text strings found in the variable descriptions, and returns an xarray Dataset. Let’s look at the syntax:

import numpy as np
import xarray as xr
from os.path import join,expanduser
import matplotlib.pyplot as plt

import ecco_access as ea

# identify user's home directory
user_home_dir = expanduser('~')
help(ea.ecco_podaac_to_xrdataset)
Help on function ecco_podaac_to_xrdataset in module ecco_access.ecco_access:

ecco_podaac_to_xrdataset(query, version='v4r4', grid=None, time_res='all', StartDate=None, EndDate=None,\ 
                         snapshot_interval=None, mode='download_ifspace', download_root_dir=None, **kwargs)
    This function queries and accesses ECCO datasets from PO.DAAC. The core query and download functions are adapted from Jupyter notebooks 
    created by Jack McNelis and Ian Fenty 
    (https://github.com/ECCO-GROUP/ECCO-ACCESS/blob/master/PODAAC/Downloading_ECCO_datasets_from_PODAAC/README.md)
    and modified by Andrew Delman (https://ecco-v4-python-tutorial.readthedocs.io).
    It is similar to ecco_podaac_access, except instead of a list of URLs or files, 
    an xarray Dataset with all of the queried ECCO datasets is returned.
    
    Parameters
    ----------    
    query: str, list, or dict, defines datasets or variables to access.
           If query is str, it specifies either a dataset ShortName (which is 
           assumed if the string begins with 'ECCO_'), or a text string that 
           can be used to search the ShortNames, variable names, and descriptions.
           A query may also be a list of multiple ShortNames and/or text searches, 
           or a dict that contains grid,time_res specifiers as keys and ShortNames 
           or text searches as values, e.g.,
           {'native,monthly':['ECCO_L4_SSH_LLC0090GRID_MONTHLY_V4R4',
                              'THETA']}
           will query the native grid monthly SSH datasets, and all native grid 
           monthly datasets with variables or descriptions matching 'THETA'.
    
    version: ('v4r4'), specifies ECCO version to query
    
    grid: ('native','latlon',None), specifies whether to query datasets with output
          on the native grid or the interpolated lat/lon grid.
          The default None will query both types of grids, unless specified 
          otherwise in a query dict (e.g., the example above).
    
    time_res: ('monthly','daily','snapshot','all'), specifies which time resolution 
              to include in query and downloads. 'all' includes all time resolutions, 
              and datasets that have no time dimension, such as the grid parameter 
              and mixing coefficient datasets.
    
    
    StartDate,EndDate: str, in 'YYYY', 'YYYY-MM', or 'YYYY-MM-DD' format, 
                       define date range [StartDate,EndDate] for download.
                       EndDate is included in the time range (unlike typical Python ranges).
                       Full ECCOv4r4 date range (default) is '1992-01-01' to '2017-12-31'.
                       For 'SNAPSHOT' datasets, an additional day is added to EndDate to enable closed budgets
                       within the specified date range.
    
    snapshot_interval: ('monthly', 'daily', or None), if snapshot datasets are included in ShortNames, 
                       this determines whether snapshots are included for only the beginning/end of each month 
                       ('monthly'), or for every day ('daily').
                       If None or not specified, defaults to 'daily' if any daily mean ShortNames are included 
                       and 'monthly' otherwise.
    
    mode: str, one of the following:
          'download': Download datasets using NASA Earthdata URLs
          'download_ifspace': Check storage availability before downloading.
                              Download only if storage footprint of downloads 
                              <= max_avail_frac*(available storage)
          'download_subset': Download spatial and temporal subsets of datasets 
                             via Opendap; query help(ecco_access.ecco_podaac_download_subset)
                             to see keyword arguments that can be used in this mode.
          The following modes work within the AWS cloud only:
          's3_open': Access datasets on S3 without downloading.
          's3_open_fsspec': Use json files (generated with `fsspec` and `kerchunk`) 
                            for expedited opening of datasets.
          's3_get': Download from S3 (to AWS EC2 instance).
          's3_get_ifspace': Check storage availability before downloading; 
                            download if storage footprint 
                            <= max_avail_frac*(available storage).
                            Otherwise data are opened "remotely" from S3 bucket.
    
    download_root_dir: str, defines parent directory to download files to.
                       Files will be downloaded to directory download_root_dir/ShortName/.
                       If not specified, parent directory defaults to '~/Downloads/ECCO_V4r4_PODAAC/'.
    
    Additional keyword arguments*:
    *This is not an exhaustive list, especially for 
    'download_subset' mode; use help(ecco_access.ecco_podaac_download_subset) to display 
    options specific to that mode
    
    max_avail_frac: float, maximum fraction of remaining available disk space to 
                    use in storing ECCO datasets.
                    If storing the datasets exceeds this fraction, an error is returned.
                    Valid range is [0,0.9]. If number provided is outside this range, it is replaced by the closer 
                    endpoint of the range.
    
    jsons_root_dir: str, for s3_open_fsspec mode only, the root/parent directory where the 
                    fsspec/kerchunk-generated jsons are found.
                    jsons are generated using the steps described here:
                    https://medium.com/pangeo/fake-it-until-you-make-it-reading-goes-netcdf4-data-on-aws-s3-as-zarr
                    -for-rapid-data-access-61e33f8fe685
                    and stored as {jsons_root_dir}/MZZ_{GRIDTYPE}_{TIME_RES}/{SHORTNAME}.json.
                    For v4r4, GRIDTYPE is '05DEG' or 'LLC0090GRID'.
                    TIME_RES is one of: ('MONTHLY','DAILY','SNAPSHOT','GEOMETRY','MIXING_COEFFS').
    
    n_workers: int, number of workers to use in concurrent downloads. Benefits typically taper off above 5-6.
    
    force_redownload: bool, if True, existing files will be redownloaded and replaced;
                            if False (default), existing files will not be replaced.
    
    Returns
    -------
    ds_out: xarray Dataset or dict of xarray Datasets (with ShortNames as keys), 
            containing all of the accessed datasets.
            This function does not work with the query modes: 'ls','query','s3_ls','s3_query'.

This function offers many options for “submitting” a query. Let’s consider a simple case: we already have the ShortName for the monthly native grid SSH from ECCOv4r4 (ECCO_L4_SSH_LLC0090GRID_MONTHLY_V4R4), and we want to access output from the year 2017. The ShortName goes in the query argument, and we can specify start and end dates (in YYYY, YYYY-MM, or YYYY-MM-DD format). The other options that matter most for this request are mode and, depending on the mode, download_root_dir or jsons_root_dir.
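
The help text states that EndDate is included in the time range, unlike typical Python ranges. The sketch below illustrates that inclusive interpretation for the three accepted string formats. It is not the library's own date handling, which may differ in detail (for example, an extra day is added for snapshot datasets); the function name inclusive_date_range is hypothetical.

```python
import calendar
from datetime import date


def inclusive_date_range(start, end):
    """Expand 'YYYY', 'YYYY-MM', or 'YYYY-MM-DD' strings into (first_day, last_day)."""
    def expand(text, is_end):
        parts = [int(p) for p in text.split('-')]
        year = parts[0]
        # A missing month means January for a start date, December for an end date
        month = parts[1] if len(parts) > 1 else (12 if is_end else 1)
        if len(parts) > 2:
            day = parts[2]
        else:
            # A missing day means the first or the last day of the month
            day = calendar.monthrange(year, month)[1] if is_end else 1
        return date(year, month, day)

    return expand(start, False), expand(end, True)


first_day, last_day = inclusive_date_range('2017-01', '2017-12')
print(first_day, last_day)  # 2017-01-01 2017-12-31
```

Under this reading, StartDate='2017-01' and EndDate='2017-12' cover the whole of 2017, which is the request made in the examples that follow.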

Direct download over the internet (mode = ‘download’)#

Let’s try the download mode, which retrieves the data over the Internet using NASA Earthdata URLs (this should work on any machine with Internet access, including cloud environments):

# download data and open xarray dataset
curr_shortname = 'ECCO_L4_SSH_LLC0090GRID_MONTHLY_V4R4'
ds_SSH = ea.ecco_podaac_to_xrdataset(curr_shortname,
                                     StartDate='2017-01', EndDate='2017-12',
                                     mode='download',
                                     download_root_dir=join(user_home_dir, 'Downloads', 'ECCO_V4r4_PODAAC'))
created download directory /home/jovyan/Downloads/ECCO_V4r4_PODAAC/ECCO_L4_SSH_LLC0090GRID_MONTHLY_V4R4
DL Progress: 100%|#########################| 12/12 [00:05<00:00,  2.16it/s]

=====================================
total downloaded: 71.02 Mb
avg download speed: 12.76 Mb/s
Time spent = 5.56673002243042 seconds

We specified a root directory for the download (which also happens to be the default setting), and the data files are placed under download_root_dir/ShortName. We can verify that the contents of the dataset are what we queried:

ds_SSH
<xarray.Dataset> Size: 25MB
Dimensions:    (time: 12, tile: 13, j: 90, i: 90, i_g: 90, j_g: 90, nv: 2, nb: 4)
Coordinates: (12/13)
  * i          (i) int32 360B 0 1 2 3 4 5 6 7 8 9 ... 81 82 83 84 85 86 87 88 89
  * i_g        (i_g) int32 360B 0 1 2 3 4 5 6 7 8 ... 81 82 83 84 85 86 87 88 89
  * j          (j) int32 360B 0 1 2 3 4 5 6 7 8 9 ... 81 82 83 84 85 86 87 88 89
  * j_g        (j_g) int32 360B 0 1 2 3 4 5 6 7 8 ... 81 82 83 84 85 86 87 88 89
  * tile       (tile) int32 52B 0 1 2 3 4 5 6 7 8 9 10 11 12
  * time       (time) datetime64[ns] 96B 2017-01-16T12:00:00 ... 2017-12-16T0...
    ...         ...
    YC         (tile, j, i) float32 421kB dask.array<chunksize=(13, 90, 90), meta=np.ndarray>
    XG         (tile, j_g, i_g) float32 421kB dask.array<chunksize=(13, 90, 90), meta=np.ndarray>
    YG         (tile, j_g, i_g) float32 421kB dask.array<chunksize=(13, 90, 90), meta=np.ndarray>
    time_bnds  (time, nv) datetime64[ns] 192B dask.array<chunksize=(1, 2), meta=np.ndarray>
    XC_bnds    (tile, j, i, nb) float32 2MB dask.array<chunksize=(13, 90, 90, 4), meta=np.ndarray>
    YC_bnds    (tile, j, i, nb) float32 2MB dask.array<chunksize=(13, 90, 90, 4), meta=np.ndarray>
Dimensions without coordinates: nv, nb
Data variables:
    SSH        (time, tile, j, i) float32 5MB dask.array<chunksize=(1, 13, 90, 90), meta=np.ndarray>
    SSHIBC     (time, tile, j, i) float32 5MB dask.array<chunksize=(1, 13, 90, 90), meta=np.ndarray>
    SSHNOIBC   (time, tile, j, i) float32 5MB dask.array<chunksize=(1, 13, 90, 90), meta=np.ndarray>
    ETAN       (time, tile, j, i) float32 5MB dask.array<chunksize=(1, 13, 90, 90), meta=np.ndarray>
Attributes: (12/57)
    acknowledgement:              This research was carried out by the Jet Pr...
    author:                       Ian Fenty and Ou Wang
    cdm_data_type:                Grid
    comment:                      Fields provided on the curvilinear lat-lon-...
    Conventions:                  CF-1.8, ACDD-1.3
    coordinates_comment:          Note: the global 'coordinates' attribute de...
    ...                           ...
    time_coverage_duration:       P1M
    time_coverage_end:            2017-02-01T00:00:00
    time_coverage_resolution:     P1M
    time_coverage_start:          2017-01-01T00:00:00
    title:                        ECCO Sea Surface Height - Monthly Mean llc9...
    uuid:                         a21a5c30-400c-11eb-a9e0-0cc47a3f49c3
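
The help text describes the 'download_ifspace' mode as downloading only if the storage footprint is at most max_avail_frac*(available storage), with max_avail_frac limited to the range [0,0.9]. The sketch below shows that check using only the standard library. It illustrates the rule as documented and is not the library's implementation; the function names and the default value of 0.5 are assumptions made for this example.

```python
import shutil


def clamp_frac(max_avail_frac, lo=0.0, hi=0.9):
    """Replace an out-of-range fraction by the closer endpoint of [lo, hi]."""
    return min(max(max_avail_frac, lo), hi)


def fits_in_storage(download_bytes, free_bytes, max_avail_frac=0.5):
    """True if the download uses at most the allowed fraction of free space."""
    return download_bytes <= clamp_frac(max_avail_frac) * free_bytes


# Free space on the file system that holds the current directory
free_bytes = shutil.disk_usage('.').free

# Roughly the size of the 12 monthly SSH files downloaded above (71.02 Mb)
download_bytes = 71.02e6
print(fits_in_storage(download_bytes, free_bytes))
```

Clamping before the comparison means that even a requested fraction of 1.0 leaves at least 10% of the free space untouched.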

In-cloud direct access with pre-generated json files (mode = ‘s3_open_fsspec’)#

If you are taking part in the ECCO Hackweek, you are also working in a cloud environment, which means that many more access modes are open to you. Let’s try s3_open_fsspec, which opens the files directly from S3 (no download necessary) and uses json files containing the data chunking information to open them exceptionally fast. For this mode you need to provide the directory where the jsons are located; on the efs_ecco drive this is /efs_ecco/mzz-jsons.

curr_shortname = 'ECCO_L4_SSH_LLC0090GRID_MONTHLY_V4R4'
ds_SSH_s3 = ea.ecco_podaac_to_xrdataset(curr_shortname,
                                        StartDate='2017-01', EndDate='2017-12',
                                        mode='s3_open_fsspec',
                                        jsons_root_dir=join('/efs_ecco', 'mzz-jsons'))
ds_SSH_s3
<xarray.Dataset> Size: 25MB
Dimensions:    (time: 12, tile: 13, j: 90, i: 90, nb: 4, j_g: 90, i_g: 90, nv: 2)
Coordinates: (12/13)
    XC         (tile, j, i) float32 421kB ...
    XC_bnds    (tile, j, i, nb) float32 2MB ...
    XG         (tile, j_g, i_g) float32 421kB ...
    YC         (tile, j, i) float32 421kB ...
    YC_bnds    (tile, j, i, nb) float32 2MB ...
    YG         (tile, j_g, i_g) float32 421kB ...
    ...         ...
  * i_g        (i_g) int32 360B 0 1 2 3 4 5 6 7 8 ... 81 82 83 84 85 86 87 88 89
  * j          (j) int32 360B 0 1 2 3 4 5 6 7 8 9 ... 81 82 83 84 85 86 87 88 89
  * j_g        (j_g) int32 360B 0 1 2 3 4 5 6 7 8 ... 81 82 83 84 85 86 87 88 89
  * tile       (tile) int32 52B 0 1 2 3 4 5 6 7 8 9 10 11 12
  * time       (time) datetime64[ns] 96B 2017-01-16T12:00:00 ... 2017-12-16T0...
    time_bnds  (time, nv) datetime64[ns] 192B ...
Dimensions without coordinates: nb, nv
Data variables:
    ETAN       (time, tile, j, i) float32 5MB ...
    SSH        (time, tile, j, i) float32 5MB ...
    SSHIBC     (time, tile, j, i) float32 5MB ...
    SSHNOIBC   (time, tile, j, i) float32 5MB ...
Attributes: (12/57)
    Conventions:                  CF-1.8, ACDD-1.3
    acknowledgement:              This research was carried out by the Jet Pr...
    author:                       Ian Fenty and Ou Wang
    cdm_data_type:                Grid
    comment:                      Fields provided on the curvilinear lat-lon-...
    coordinates_comment:          Note: the global 'coordinates' attribute de...
    ...                           ...
    time_coverage_duration:       P1M
    time_coverage_end:            1992-02-01T00:00:00
    time_coverage_resolution:     P1M
    time_coverage_start:          1992-01-01T12:00:00
    title:                        ECCO Sea Surface Height - Monthly Mean llc9...
    uuid:                         9302811e-400c-11eb-b69e-0cc47a3f49c3

Now plot the SSH for Jan 2017 in tile 10 (Python numbering convention; tile 11 in the Fortran/MATLAB numbering convention). Here we use “RdYlBu”, one of the many built-in colormaps that the matplotlib package provides; you can also create your own. The “_r” suffix reverses the direction of the colormap, so that red corresponds to the maximum values.

ds_SSH_s3.SSH.isel(time=0,tile=10).plot(cmap='RdYlBu_r')
<matplotlib.collections.QuadMesh at 0x7fe336d81750>
[Figure: map of Jan 2017 SSH in tile 10]

We can also use the ecco_v4_py package to plot a global map of Jan 2017 SSH, using the plot_proj_to_latlon_grid function which regrids from the native LLC grid to a lat/lon grid.

import ecco_v4_py as ecco

plt.figure(figsize=(12, 6), dpi=90)
ecco.plot_proj_to_latlon_grid(ds_SSH_s3.XC, ds_SSH_s3.YC,
                              ds_SSH_s3.SSH.isel(time=0),
                              user_lon_0=-160,
                              projection_type='robin',
                              plot_type='pcolormesh',
                              cmap='RdBu_r',
                              dx=1, dy=1, cmin=-1.2, cmax=1.2, show_colorbar=True)
plt.title('Monthly mean SSH [m], Jan 2017')
plt.show()
[Figure: global map of monthly mean SSH [m], Jan 2017]

What if you don’t know the ShortName already?#

NASA Earthdata datasets are identified by ShortNames, but you might not know the ShortName of the variable or category of variables that you are seeking. One way to find the ShortName is to consult these ECCOv4r4 variable lists. But the “query” in ecco_access functions does not have to be a ShortName; it can also be a text string representing a variable name, or a word or phrase in the variable description.
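
According to the help text, a string query is treated as a ShortName if it begins with 'ECCO_' and as a text search otherwise, and a query may also be a list or a dict whose keys are 'grid,time_res' specifiers. The sketch below shows how each of these forms could be interpreted. It is an illustration of the documented rules, not code from ecco_access, and the function name describe_query is hypothetical.

```python
def describe_query(query):
    """Return (grid, time_res, kind, text) tuples for a str, list, or dict query."""
    if isinstance(query, str):
        # Documented rule: strings beginning with 'ECCO_' are assumed to be ShortNames
        kind = 'ShortName' if query.startswith('ECCO_') else 'text search'
        return [(None, None, kind, query)]
    if isinstance(query, list):
        return [item for q in query for item in describe_query(q)]
    if isinstance(query, dict):
        described = []
        for key, value in query.items():
            # Dict keys look like 'native,monthly': grid first, then time resolution
            grid, time_res = key.split(',')
            for _, _, kind, text in describe_query(value):
                described.append((grid, time_res, kind, text))
        return described
    raise TypeError('query must be a str, list, or dict')


# The dict example given in the help text
query = {'native,monthly': ['ECCO_L4_SSH_LLC0090GRID_MONTHLY_V4R4', 'THETA']}
for entry in describe_query(query):
    print(entry)
```

With a dict query, the grid and time resolution are attached to each entry, so the separate grid and time_res arguments are not needed for those entries.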

For example, perhaps you want to open the dataset that contains native grid monthly sea ice concentration in 2007. If the query is not identified as a ShortName, a text search of the variable lists is conducted using query, grid, and time_res. The matches are then listed, and the user is asked to select one.

ds_seaice_conc = ea.ecco_podaac_to_xrdataset('ice', grid='native', time_res='monthly',
                                             StartDate='2007-01', EndDate='2007-12',
                                             mode='s3_open_fsspec',
                                             jsons_root_dir=join('/efs_ecco', 'mzz-jsons'))
ShortName Options for query "ice":
                  Variable Name     Description (units)

Option 1: ECCO_L4_SSH_LLC0090GRID_MONTHLY_V4R4    *native grid,monthly means*
                  SSH               Dynamic sea surface height anomaly. Suitable for
                                    comparisons with altimetry sea surface height data
                                    products that apply the inverse barometer
                                    correction. (m)
                  SSHIBC            The inverted barometer correction to sea surface
                                    height due to atmospheric pressure loading. (m)
                  SSHNOIBC          Sea surface height anomaly without the inverted
                                    barometer correction. Suitable for comparisons
                                    with altimetry sea surface height data products
                                    that do NOT apply the inverse barometer
                                    correction. (m)
                  ETAN              Model sea level anomaly, without corrections for
                                    global mean density changes, inverted barometer
                                    effect, or volume displacement due to submerged
                                    sea-ice and snow. (m)

Option 2: ECCO_L4_STRESS_LLC0090GRID_MONTHLY_V4R4    *native grid,monthly means*
                  EXFtaux           Wind stress in the model +x direction (N/m^2)
                  EXFtauy           Wind stress in the model +y direction (N/m^2)
                  oceTAUX           Ocean surface stress in the model +x direction,
                                    due to wind and sea-ice (N/m^2)
                  oceTAUY           Ocean surface stress in the model +y direction,
                                    due to wind and sea-ice (N/m^2)

Option 3: ECCO_L4_HEAT_FLUX_LLC0090GRID_MONTHLY_V4R4    *native grid,monthly means*
                  EXFhl             Open ocean air-sea latent heat flux (W/m^2)
                  EXFhs             Open ocean air-sea sensible heat flux (W/m^2)
                  EXFlwdn           Downward longwave radiative flux (W/m^2)
                  EXFswdn           Downwelling shortwave radiative flux (W/m^2)
                  EXFqnet           Open ocean net air-sea heat flux (W/m^2)
                  oceQnet           Net heat flux into the ocean surface (W/m^2)
                  SIatmQnt          Net upward heat flux to the atmosphere (W/m^2)
                  TFLUX             Rate of change of ocean heat content per m^2
                                    accounting for mass (e.g. freshwater) fluxes
                                    (W/m^2)
                  EXFswnet          Open ocean net shortwave radiative flux (W/m^2)
                  EXFlwnet          Net open ocean longwave radiative flux (W/m^2)
                  oceQsw            Net shortwave radiative flux across the ocean
                                    surface (W/m^2)
                  SIaaflux          Conservative ocean and sea-ice advective heat flux
                                    adjustment, associated with temperature difference
                                    between sea surface temperature and sea-ice,
                                    excluding latent heat of fusion (W/m^2)

Option 4: ECCO_L4_FRESH_FLUX_LLC0090GRID_MONTHLY_V4R4    *native grid,monthly means*
                  EXFpreci          Precipitation rate (m/s)
                  EXFevap           Open ocean evaporation rate (m/s)
                  EXFroff           River runoff (m/s)
                  SIsnPrcp          Snow precipitation on sea-ice (kg/(m^2 s))
                  EXFempmr          Open ocean net surface freshwater flux from
                                    precipitation, evaporation, and runoff (m/s)
                  oceFWflx          Net freshwater flux into the ocean (kg/(m^2 s))
                  SIatmFW           Net freshwater flux into the open ocean, sea-ice,
                                    and snow (kg/(m^2 s))
                  SFLUX             Rate of change of total ocean salinity per m^2
                                    accounting for mass fluxes (g/(m^2 s))
                  SIacSubl          Freshwater flux to the atmosphere due to
                                    sublimation-deposition of snow or ice (kg/(m^2 s))
                  SIrsSubl          Residual sublimation freshwater flux (kg/(m^2 s))
                  SIfwThru          Precipitation through sea-ice (kg/(m^2 s))

Option 5: ECCO_L4_SEA_ICE_CONC_THICKNESS_LLC0090GRID_MONTHLY_V4R4    *native grid,monthly means*
                  SIarea            Sea-ice concentration (fraction between 0 and 1)
                  SIheff            Area-averaged sea-ice thickness (m)
                  SIhsnow           Area-averaged snow thickness (m)
                  sIceLoad          Average sea-ice and snow mass per unit area
                                    (kg/m^2)

Option 6: ECCO_L4_SEA_ICE_VELOCITY_LLC0090GRID_MONTHLY_V4R4    *native grid,monthly means*
                  SIuice            Sea-ice velocity in the model +x direction (m/s)
                  SIvice            Sea-ice velocity in the model +y direction (m/s)

Option 7: ECCO_L4_SEA_ICE_HORIZ_VOLUME_FLUX_LLC0090GRID_MONTHLY_V4R4    *native grid,monthly means*
                  ADVxHEFF          Lateral advective flux of sea-ice thickness in the
                                    model +x direction (m^3/s)
                  ADVyHEFF          Lateral advective flux of sea-ice thickness in the
                                    model +y direction (m^3/s)
                  DFxEHEFF          Lateral diffusive flux of sea-ice thickness in the
                                    model +x direction (m^3/s)
                  DFyEHEFF          Lateral diffusive flux of sea-ice thickness in the
                                    model +y direction (m^3/s)
                  ADVxSNOW          Lateral advective flux of snow thickness in the
                                    model +x direction (m^3/s)
                  ADVySNOW          Lateral advective flux of snow thickness in the
                                    model +y direction (m^3/s)
                  DFxESNOW          Lateral diffusive flux of snow thickness in the
                                    model +x direction (m^3/s)
                  DFyESNOW          Lateral diffusive flux of snow thickness in the
                                    model +y direction (m^3/s)

Option 8: ECCO_L4_SEA_ICE_SALT_PLUME_FLUX_LLC0090GRID_MONTHLY_V4R4    *native grid,monthly means*
                  oceSPflx          Net salt flux into the ocean due to brine
                                    rejection (g/(m^2 s))
                  oceSPDep          Salt plume depth (m)


Using dataset with ShortName: ECCO_L4_SEA_ICE_CONC_THICKNESS_LLC0090GRID_MONTHLY_V4R4
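
The option list above includes datasets such as SSH that are not sea ice products. This appears consistent with a case-insensitive substring match against ShortNames, variable names, and descriptions: the ETAN description mentions "sea-ice", for instance. The sketch below reproduces that behavior on a tiny hand-made catalog. It is an assumption about how the matching works, not the library's search code. The catalog entries are trimmed from the listing above, except the temperature entry, which is added only as a non-matching example.

```python
# Miniature catalog: ShortName -> {variable name: description}
catalog = {
    'ECCO_L4_SSH_LLC0090GRID_MONTHLY_V4R4': {
        'ETAN': 'Model sea level anomaly, without corrections for global mean '
                'density changes, inverted barometer effect, or volume '
                'displacement due to submerged sea-ice and snow. (m)',
    },
    'ECCO_L4_SEA_ICE_CONC_THICKNESS_LLC0090GRID_MONTHLY_V4R4': {
        'SIarea': 'Sea-ice concentration (fraction between 0 and 1)',
    },
    'ECCO_L4_SEA_ICE_SALT_PLUME_FLUX_LLC0090GRID_MONTHLY_V4R4': {
        'oceSPDep': 'Salt plume depth (m)',
    },
    # Illustrative non-matching entry (not part of the listing above)
    'ECCO_L4_TEMP_SALINITY_LLC0090GRID_MONTHLY_V4R4': {
        'THETA': 'Potential temperature (degC)',
    },
}


def search_catalog(text, catalog):
    """Return ShortNames whose name, variables, or descriptions contain text."""
    needle = text.lower()
    matches = []
    for shortname, variables in catalog.items():
        searchable = [shortname] + list(variables) + list(variables.values())
        if any(needle in item.lower() for item in searchable):
            matches.append(shortname)
    return matches


for shortname in search_catalog('ice', catalog):
    print(shortname)
```

A more specific search string, such as a variable name like 'SIarea', would narrow the list of options to choose from.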

We selected option 5, corresponding to ShortName ECCO_L4_SEA_ICE_CONC_THICKNESS_LLC0090GRID_MONTHLY_V4R4. Let’s look at the dataset contents.

ds_seaice_conc
<xarray.Dataset> Size: 25MB
Dimensions:    (time: 12, tile: 13, j: 90, i: 90, nb: 4, j_g: 90, i_g: 90, nv: 2)
Coordinates: (12/13)
    XC         (tile, j, i) float32 421kB ...
    XC_bnds    (tile, j, i, nb) float32 2MB ...
    XG         (tile, j_g, i_g) float32 421kB ...
    YC         (tile, j, i) float32 421kB ...
    YC_bnds    (tile, j, i, nb) float32 2MB ...
    YG         (tile, j_g, i_g) float32 421kB ...
    ...         ...
  * i_g        (i_g) int32 360B 0 1 2 3 4 5 6 7 8 ... 81 82 83 84 85 86 87 88 89
  * j          (j) int32 360B 0 1 2 3 4 5 6 7 8 9 ... 81 82 83 84 85 86 87 88 89
  * j_g        (j_g) int32 360B 0 1 2 3 4 5 6 7 8 ... 81 82 83 84 85 86 87 88 89
  * tile       (tile) int32 52B 0 1 2 3 4 5 6 7 8 9 10 11 12
  * time       (time) datetime64[ns] 96B 2007-01-16T12:00:00 ... 2007-12-16T1...
    time_bnds  (time, nv) datetime64[ns] 192B ...
Dimensions without coordinates: nb, nv
Data variables:
    SIarea     (time, tile, j, i) float32 5MB ...
    SIheff     (time, tile, j, i) float32 5MB ...
    SIhsnow    (time, tile, j, i) float32 5MB ...
    sIceLoad   (time, tile, j, i) float32 5MB ...
Attributes: (12/57)
    Conventions:                  CF-1.8, ACDD-1.3
    acknowledgement:              This research was carried out by the Jet Pr...
    author:                       Ian Fenty and Ou Wang
    cdm_data_type:                Grid
    comment:                      Fields provided on the curvilinear lat-lon-...
    coordinates_comment:          Note: the global 'coordinates' attribute de...
    ...                           ...
    time_coverage_duration:       P1M
    time_coverage_end:            1992-02-01T00:00:00
    time_coverage_resolution:     P1M
    time_coverage_start:          1992-01-01T12:00:00
    title:                        ECCO Sea-Ice and Snow Concentration and Thi...
    uuid:                         cc62f1c2-400d-11eb-9f11-0cc47a3f49c3

Now plot the sea ice concentration (fraction) in tile 6, which approximately covers the Arctic Ocean, during Sep 2007 (time index 8, since Python indexing starts at 0). At the time, this month was a record minimum for Arctic sea ice.

ds_seaice_conc.SIarea.isel(time=8,tile=6).plot(cmap='cool')
<matplotlib.collections.QuadMesh at 0x7fe3363eb810>
[Figure: map of Sep 2007 sea ice concentration in tile 6]

Using the ecco_podaac_access function#

In-cloud direct access (mode = ‘s3_open’)#

The ecco_podaac_to_xrdataset function that was previously used invokes ecco_podaac_access under the hood, and ecco_podaac_access can also be called directly. This can be useful if you want to obtain a list of file objects/paths or URLs that you can then process with your own code. Let’s use this function with mode = s3_open (all s3 modes only work from an AWS cloud environment in region us-west-2).

files_dict = ea.ecco_podaac_access(curr_shortname,
                                   StartDate='2015-01', EndDate='2015-12',
                                   mode='s3_open')
{'ShortName': 'ECCO_L4_SSH_LLC0090GRID_MONTHLY_V4R4', 'temporal': '2015-01-02,2015-12-31'}

Total number of matching granules: 12
files_dict[curr_shortname]
[<File-like object S3FileSystem, podaac-ops-cumulus-protected/ECCO_L4_SSH_LLC0090GRID_MONTHLY_V4R4/SEA_SURFACE_HEIGHT_mon_mean_2015-01_ECCO_V4r4_native_llc0090.nc>,
 <File-like object S3FileSystem, podaac-ops-cumulus-protected/ECCO_L4_SSH_LLC0090GRID_MONTHLY_V4R4/SEA_SURFACE_HEIGHT_mon_mean_2015-02_ECCO_V4r4_native_llc0090.nc>,
 <File-like object S3FileSystem, podaac-ops-cumulus-protected/ECCO_L4_SSH_LLC0090GRID_MONTHLY_V4R4/SEA_SURFACE_HEIGHT_mon_mean_2015-03_ECCO_V4r4_native_llc0090.nc>,
 <File-like object S3FileSystem, podaac-ops-cumulus-protected/ECCO_L4_SSH_LLC0090GRID_MONTHLY_V4R4/SEA_SURFACE_HEIGHT_mon_mean_2015-04_ECCO_V4r4_native_llc0090.nc>,
 <File-like object S3FileSystem, podaac-ops-cumulus-protected/ECCO_L4_SSH_LLC0090GRID_MONTHLY_V4R4/SEA_SURFACE_HEIGHT_mon_mean_2015-05_ECCO_V4r4_native_llc0090.nc>,
 <File-like object S3FileSystem, podaac-ops-cumulus-protected/ECCO_L4_SSH_LLC0090GRID_MONTHLY_V4R4/SEA_SURFACE_HEIGHT_mon_mean_2015-06_ECCO_V4r4_native_llc0090.nc>,
 <File-like object S3FileSystem, podaac-ops-cumulus-protected/ECCO_L4_SSH_LLC0090GRID_MONTHLY_V4R4/SEA_SURFACE_HEIGHT_mon_mean_2015-07_ECCO_V4r4_native_llc0090.nc>,
 <File-like object S3FileSystem, podaac-ops-cumulus-protected/ECCO_L4_SSH_LLC0090GRID_MONTHLY_V4R4/SEA_SURFACE_HEIGHT_mon_mean_2015-08_ECCO_V4r4_native_llc0090.nc>,
 <File-like object S3FileSystem, podaac-ops-cumulus-protected/ECCO_L4_SSH_LLC0090GRID_MONTHLY_V4R4/SEA_SURFACE_HEIGHT_mon_mean_2015-09_ECCO_V4r4_native_llc0090.nc>,
 <File-like object S3FileSystem, podaac-ops-cumulus-protected/ECCO_L4_SSH_LLC0090GRID_MONTHLY_V4R4/SEA_SURFACE_HEIGHT_mon_mean_2015-10_ECCO_V4r4_native_llc0090.nc>,
 <File-like object S3FileSystem, podaac-ops-cumulus-protected/ECCO_L4_SSH_LLC0090GRID_MONTHLY_V4R4/SEA_SURFACE_HEIGHT_mon_mean_2015-11_ECCO_V4r4_native_llc0090.nc>,
 <File-like object S3FileSystem, podaac-ops-cumulus-protected/ECCO_L4_SSH_LLC0090GRID_MONTHLY_V4R4/SEA_SURFACE_HEIGHT_mon_mean_2015-12_ECCO_V4r4_native_llc0090.nc>]

The output of ecco_podaac_access is in the form of a dictionary with ShortNames as keys. In this case, the value associated with this ShortName is a list of 12 file objects. These are files on S3 (AWS’s cloud storage system) that have been opened, which is a necessary step for the files’ data to be accessed. The list of open files can be passed directly to xarray.open_mfdataset.

ds_SSH_fromlist = xr.open_mfdataset(files_dict[curr_shortname],
                                    compat='override', data_vars='minimal', coords='minimal',
                                    parallel=True)
ds_SSH_fromlist
<xarray.Dataset> Size: 25MB
Dimensions:    (time: 12, tile: 13, j: 90, i: 90, i_g: 90, j_g: 90, nv: 2, nb: 4)
Coordinates: (12/13)
  * i          (i) int32 360B 0 1 2 3 4 5 6 7 8 9 ... 81 82 83 84 85 86 87 88 89
  * i_g        (i_g) int32 360B 0 1 2 3 4 5 6 7 8 ... 81 82 83 84 85 86 87 88 89
  * j          (j) int32 360B 0 1 2 3 4 5 6 7 8 9 ... 81 82 83 84 85 86 87 88 89
  * j_g        (j_g) int32 360B 0 1 2 3 4 5 6 7 8 ... 81 82 83 84 85 86 87 88 89
  * tile       (tile) int32 52B 0 1 2 3 4 5 6 7 8 9 10 11 12
  * time       (time) datetime64[ns] 96B 2015-01-16T12:00:00 ... 2015-12-16T1...
    ...         ...
    YC         (tile, j, i) float32 421kB dask.array<chunksize=(13, 90, 90), meta=np.ndarray>
    XG         (tile, j_g, i_g) float32 421kB dask.array<chunksize=(13, 90, 90), meta=np.ndarray>
    YG         (tile, j_g, i_g) float32 421kB dask.array<chunksize=(13, 90, 90), meta=np.ndarray>
    time_bnds  (time, nv) datetime64[ns] 192B dask.array<chunksize=(1, 2), meta=np.ndarray>
    XC_bnds    (tile, j, i, nb) float32 2MB dask.array<chunksize=(13, 90, 90, 4), meta=np.ndarray>
    YC_bnds    (tile, j, i, nb) float32 2MB dask.array<chunksize=(13, 90, 90, 4), meta=np.ndarray>
Dimensions without coordinates: nv, nb
Data variables:
    SSH        (time, tile, j, i) float32 5MB dask.array<chunksize=(1, 13, 90, 90), meta=np.ndarray>
    SSHIBC     (time, tile, j, i) float32 5MB dask.array<chunksize=(1, 13, 90, 90), meta=np.ndarray>
    SSHNOIBC   (time, tile, j, i) float32 5MB dask.array<chunksize=(1, 13, 90, 90), meta=np.ndarray>
    ETAN       (time, tile, j, i) float32 5MB dask.array<chunksize=(1, 13, 90, 90), meta=np.ndarray>
Attributes: (12/57)
    acknowledgement:              This research was carried out by the Jet Pr...
    author:                       Ian Fenty and Ou Wang
    cdm_data_type:                Grid
    comment:                      Fields provided on the curvilinear lat-lon-...
    Conventions:                  CF-1.8, ACDD-1.3
    coordinates_comment:          Note: the global 'coordinates' attribute de...
    ...                           ...
    time_coverage_duration:       P1M
    time_coverage_end:            2015-02-01T00:00:00
    time_coverage_resolution:     P1M
    time_coverage_start:          2015-01-01T00:00:00
    title:                        ECCO Sea Surface Height - Monthly Mean llc9...
    uuid:                         a4955186-400c-11eb-8c14-0cc47a3f49c3
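
If you process the file list with your own code instead, the file names themselves carry useful information. The sketch below extracts the year and month from monthly-mean file names like those listed above. It works on plain path strings copied from that listing; how you obtain a path string from a file object depends on the access mode and is not covered here. The function name granule_month is hypothetical.

```python
import re
from os.path import basename


def granule_month(path):
    """Return (year, month) parsed from a monthly-mean ECCO file name."""
    match = re.search(r'_(\d{4})-(\d{2})_ECCO_', basename(path))
    if match is None:
        raise ValueError('no YYYY-MM date found in ' + path)
    return int(match.group(1)), int(match.group(2))


# Two paths copied from the S3 listing above
paths = [
    'podaac-ops-cumulus-protected/ECCO_L4_SSH_LLC0090GRID_MONTHLY_V4R4/'
    'SEA_SURFACE_HEIGHT_mon_mean_2015-01_ECCO_V4r4_native_llc0090.nc',
    'podaac-ops-cumulus-protected/ECCO_L4_SSH_LLC0090GRID_MONTHLY_V4R4/'
    'SEA_SURFACE_HEIGHT_mon_mean_2015-12_ECCO_V4r4_native_llc0090.nc',
]

# Keep only the files from the second half of the year
late_2015 = [p for p in paths if granule_month(p)[1] >= 7]
print(len(late_2015))  # 1
```

Filtering the list this way before opening the files lets you read only the months you need.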