Load Data

Older Approach

This script first reads the data source (JPL, CSR, or ITSG), then reads the file path for the GRACE L2 spherical-harmonic input files, and finally reads the paths of the TN-13 and TN-14 replacement files corresponding to the chosen source.

read_GRACE_SH_paths(use_sample_files=0)

Returns path of data files, path of tn13 and path of tn14 replacement files

Parameters:

Name Type Description Default
use_sample_files int

Set to 1 to use the sample TN-13/TN-14 replacement files bundled with the package; with the default of 0 the user is prompted for the file paths. Defaults to 0.

0

Raises:

Type Description
Exception

Raised when the entered source is not one of JPL, CSR or ITSG.

Returns:

Name Type Description
tuple

Path of the SH data folder, path of the TN-13 file, path of the TN-14 file, and the source string.

Source code in pyshbundle/read_GRACE_SH_paths.py
def read_GRACE_SH_paths(use_sample_files = 0):
    """Returns path of data files, path of tn13 and path of tn14 replacement files

    Args:
        use_sample_files (int, optional): Set to 1 to use the sample TN-13/TN-14
            replacement files bundled with the package; 0 prompts the user for
            the file paths. Defaults to 0.

    Raises:
        Exception: If the entered source is not one of JPL, CSR or ITSG.

    Returns:
        tuple: (path_sh, path_tn13, path_tn14, source)
    """

    print("This program supports working with GRACE L2 Spherical harmonics data from the following centers: CSR, JPL and ITSG")
    print("Instructions to download data may be referred to in https://github.com/mn5hk/pyshbundle/blob/main/docs/index.md#how-to-download-data")
    source = str(input("Enter the source of L2 SH coeffs code(jpl, csr, itsg): "))

    if use_sample_files ==1:

        print("You have chosen to use sample replacement files.")
        print("The replacement files for the TN13 and TN14 parameters have been preloaded into the program")
        print("Due to the size of the GRACE SH files, these have not been preloaded into the program")
        print("You may download the GRACE SH L2 files from the link below. Please ensure to download the files as per your selection of source in the prior step")
        print("Download sample files from: https://github.com/mn5hk/pyshbundle/tree/main/sample_input_data")
    path_sh = str(input("Enter the path to the folder with SH L2 data"))


    if str.upper(source) == 'JPL':
        if use_sample_files == 1:
            path_tn13 = pkg_resources.resource_filename('pyshbundle', 'data/sample_JPL_TN_files/TN-13_GEOC_JPL_RL06.txt')
            path_tn14 = pkg_resources.resource_filename('pyshbundle', 'data/sample_JPL_TN_files/TN-14_C30_C20_GSFC_SLR.txt')
            print("Successfully loaded preloaded TN13 and TN14 replacement files for JPL")
        else:
            path_tn13 = str(input("Enter the path to the file for tn13 replacement in .txt format"))
            path_tn14 = str(input("Enter the path to the file for tn14 replacement in .txt format"))
            print("Successfully loaded TN13 and TN14 replacement files for JPL")

    elif str.upper(source) == 'CSR':
        if use_sample_files == 1:
            path_tn13 = pkg_resources.resource_filename('pyshbundle', 'data/sample_CSR_TN_files/TN-13_GEOC_CSR_RL06.1.txt')
            path_tn14 = pkg_resources.resource_filename('pyshbundle', 'data/sample_CSR_TN_files/TN-14_C30_C20_SLR_GSFC.txt')
            print("Successfully loaded preloaded TN13 and TN14 replacement files for CSR")
        else:
            path_tn13 = str(input("Enter the path to the file for tn13 replacement in .txt format"))
            path_tn14 = str(input("Enter the path to the file for tn14 replacement in .txt format"))
            print("Successfully loaded TN13 and TN14 replacement files for CSR")

    elif str.upper(source) == 'ITSG':
        if use_sample_files == 1:
            path_tn13 = pkg_resources.resource_filename('pyshbundle', 'data/sample_ITSG_TN_files/TN-13_GEOC_CSR_RL06.1.txt')
            path_tn14 = pkg_resources.resource_filename('pyshbundle', 'data/sample_ITSG_TN_files/TN-14_C30_C20_SLR_GSFC.txt')
            print("Successfully loaded preloaded TN13 and TN14 replacement files for ITSG")
        else:
            path_tn13 = str(input("Enter the path to the file for tn13 replacement in .txt format"))
            path_tn14 = str(input("Enter the path to the file for tn14 replacement in .txt format"))
            print("Successfully loaded TN13 and TN14 replacement files for ITSG")
    else:
        raise Exception("Source selection is incorrect. Please select between JPL, CSR or ITSG")

    return path_sh, path_tn13, path_tn14, source
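Because the function is interactive, the only non-trivial logic is the source check at the end. A minimal, testable sketch of that validation step (the helper `normalise_source` is hypothetical, not part of pyshbundle):

```python
# Sketch of the source-validation step performed inside
# read_GRACE_SH_paths; normalise_source is a hypothetical helper.
def normalise_source(source: str) -> str:
    """Upper-case the user input and verify it is a supported centre."""
    source = source.strip().upper()
    if source not in ("JPL", "CSR", "ITSG"):
        raise Exception("Source selection is incorrect. "
                        "Please select between JPL, CSR or ITSG")
    return source
```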

Replaces certain low-degree coefficients with the given replacement coefficients.

Parameters:

Name Type Description Default
path str

Path to the folder containing the GRACE L2 .gz files.

required
path_tn14 str

Path to the TN-14 C20/C30 replacement file.

required
path_tn13 str

Path to the TN-13 degree-1 replacement file.

required

Returns:

Name Type Description
tuple

saved_as_num (list of degree, order, clm, slm, clm_std, slm_std, start_date, end_date), dates_start, dates_end, and the number of files read.
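The replacement step overwrites the C20 (and, where available, C30) coefficients read from the L2 files with the SLR-derived values from TN-14. A toy sketch of that idea on a flat coefficient list (the helper and all values are hypothetical):

```python
import math

# Toy illustration of the TN-14 replacement idea: overwrite C20 (and C30
# when it is not NaN) at known positions in a flat coefficient list.
def replace_c20_c30(clm, c20, c30, i20, i30):
    clm = list(clm)
    clm[i20] = c20
    if not math.isnan(c30):   # TN-14 lists NaN for months without a C30 value
        clm[i30] = c30
    return clm
```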

Source code in pyshbundle/reader_replacer_jpl.py
def reader_replacer_jpl(path, path_tn14, path_tn13):
    """Replaces certain low-degree coefficients with the given replacement coefficients.

    Args:
        path (str): Path to the folder containing the GRACE L2 .gz files.
        path_tn14 (str): Path to the TN-14 C20/C30 replacement file.
        path_tn13 (str): Path to the TN-13 degree-1 replacement file.

    Returns:
        tuple: (saved_as_num, dates_start, dates_end, no_of_files)
    """
    file_list = os.listdir(path)

    filenames = os.listdir(path)  # Names of files in folder
    # Identify the data product source
    if 'GFZ' in str(filenames[0]):
        source = str('GFZ')
    elif 'CSR' in str(filenames[0]):
        source = str('CSR')
    elif 'JPL' in str(filenames[0]):
        source = str('JPL')
    elif 'ITSG' in str(filenames[0]):
        source = str('ITSG')
    print(source)
    aa = sorted(file_list, key=last_4chars)  # sort files chronologically; the first file gives the start year
    year_start = aa[0][6:10]
    time_axes = 0

    # Counts the number of files
    no_of_files = 0

    # Empty lists to store variables
    line_num = 0
    oi = 22
    degree = [[] for x in range(oi)]
    order = [[] for x in range(oi)]
    clm = [[] for x in range(oi)]
    slm = [[] for x in range(oi)]
    slm_std = [[] for x in range(oi)]
    clm_std = [[] for x in range(oi)]
    start_date = [[] for x in range(oi)]
    end_date = [[] for x in range(oi)]

    # Iterate through all files
    for file in sorted(file_list, key=last_4chars):
        # print(file)
        if file.endswith(".gz"):
            file_name = f"{path}/{file}"
            # print(file_name[-39:-32])

            # Save yearwise
            year_start, time_axes = TIME(year_start, file_name, time_axes)

            # Call the function 'reader'
            reader(file_name, line_num, degree, order, clm, slm, slm_std,
                   clm_std, start_date, end_date, year_start, time_axes)
            no_of_files = no_of_files + 1
    print('Reading into clm format complete!')
    print("Number of files read:", no_of_files)

    Lmax = degree[0][-1]
    degree_order = int((Lmax+1) * (Lmax+2)/2)

    # Replacement
    print('Starting replacement')

    # Replace deg 2,3
    new_file_TN14 = path_tn14

    rep_start_date, rep_end_date, c20, c30, c20_sigma, c30_sigma = [], [], [], [], [], []
    with open(new_file_TN14, "r", encoding='utf8') as file:
        stuff = file.readlines()
        for i in range(0, len(stuff), 1):
            line_num = str('not found')
            if stuff[i] == str('Product:\n'):
                print("found:", i+1)
                line_num = i + 1
                break
        if type(line_num) is str:
            print('Replacement data not found')
        pattern = r'\s+'
        count = 0
        while (line_num < len(stuff)):
            split = re.split(pattern, str(stuff[line_num]))
            c20.append(float(split[2]))
            c30.append(float(split[5]))
            c20_sigma.append(float(split[4])*1e-10)
            c30_sigma.append(float(split[7])*1e-10)
            rep_start_date.append(
                str(julian.from_jd(float(split[0]), fmt='mjd').date()))
            rep_end_date.append(
                str(julian.from_jd(float(split[8]), fmt='mjd').date()))
            line_num = line_num + 1
            count = count + 1

    # Actual replacement

    index = 0
    for year in range(0, time_axes+1, 1):
        for y in range(0, round(len(degree[year])/int(degree_order-3)), 1):
            while(index < len(c20)):

                curr_date = start_date[year][y*degree_order]
                rep_date_cmp = rep_start_date[index]

                if curr_date == rep_date_cmp:
                    print(curr_date, rep_date_cmp, index)
                    clm[year][y*int(degree_order-3)] = c20[index]
                    clm_std[year][y*int(degree_order-3)] = c20_sigma[index]
                    if math.isnan(c30[index]) == False:
                        clm[year][y*int(degree_order-3)+3] = c30[index]
                        clm_std[year][y*int(degree_order-3) +
                                      3] = c30_sigma[index]
                    index = index + 1
                    break
                else:
                    index = index + 1
    print('Degree 2,3 replacement complete!')

    ''' Replace deg 1 '''
    new_file_TN13 = path_tn13
    rep_start_date_deg1, rep_end_date_deg1, c1m, s1m, c1m_sigma, s1m_sigma = [], [], [], [], [], []
    with open(new_file_TN13, "r", encoding='utf8') as file:
        stuff = file.readlines()
        for i in range(0, len(stuff), 1):
            if stuff[i] == str('end of header ===============================================================================\n'):
                print("found:", i+1)
                line_num = i + 1
                break
            else:
                line_num = str('not found')
        pattern = r'\s+'
        count = 0
        while (line_num < len(stuff)):
            split = re.split(pattern, str(stuff[line_num]))
            c1m.append(float(split[3]))
            s1m.append(float(split[4]))
            c1m_sigma.append(float(split[5])*1e-10)
            s1m_sigma.append(float(split[6])*1e-10)
            rep_start_date_deg1.append(
                split[7][0:4] + '-'+split[7][4:6] + '-'+split[7][6:8])
            rep_end_date_deg1.append(
                split[8][0:4] + '-'+split[8][4:6] + '-'+split[8][6:8])
            line_num = line_num + 1
            count = count + 1

    # replace deg 1
    index = 0
    for year in range(0, time_axes+1, 1):
        for y in range(0, int(len(degree[year])/int(degree_order-3)), 1):
            while(index < len(c1m)):

                curr_date_deg1 = start_date[year][y*int(degree_order-1)]
                rep_date_cmp_deg1 = rep_start_date_deg1[index]

                if curr_date_deg1 == rep_date_cmp_deg1:
                    print(curr_date_deg1, rep_date_cmp_deg1, index)
                    degree[year].insert(y*int(degree_order-1), int(1))
                    degree[year].insert(y*int(degree_order-1)+1, int(1))
                    order[year].insert(y*int(degree_order-1), int(0))
                    order[year].insert(y*int(degree_order-1)+1, int(1))
                    clm[year].insert(y*int(degree_order-1), float(c1m[index]))
                    clm[year].insert(y*int(degree_order-1) +
                                     1, float(c1m[index+1]))
                    slm[year].insert(y*int(degree_order-1), float(s1m[index]))
                    slm[year].insert(y*int(degree_order-1) +
                                     1, float(s1m[index+1]))
                    clm_std[year].insert(
                        y*int(degree_order-1), float(c1m_sigma[index]))
                    clm_std[year].insert(
                        y*int(degree_order-1)+1, float(c1m_sigma[index+1]))
                    slm_std[year].insert(
                        y*int(degree_order-1), float(s1m_sigma[index]))
                    slm_std[year].insert(
                        y*int(degree_order-1)+1, float(s1m_sigma[index+1]))
                    start_date[year].insert(
                        y*int(degree_order-1), rep_start_date_deg1[index])
                    start_date[year].insert(
                        y*int(degree_order-1)+1, rep_start_date_deg1[index+1])
                    end_date[year].insert(
                        y*int(degree_order-1), rep_end_date_deg1[index])
                    end_date[year].insert(
                        y*int(degree_order-1)+1, rep_end_date_deg1[index+1])
                    index = index + 2
                    break
                else:
                    index = index + 2
    print('Degree 1 replacement complete!')

    ''' Add degree 0 coeffs (zeros) '''
    for year in range(0, time_axes+1, 1):
        for y in range(0, int(len(degree[year])/int(degree_order-1)), 1):
            degree[year].insert(y*degree_order, float(0))
            order[year].insert(y*degree_order, float(0))
            clm[year].insert(y*degree_order, float(0))
            slm[year].insert(y*degree_order, float(0))
            clm_std[year].insert(y*degree_order, float(0))
            slm_std[year].insert(y*degree_order, float(0))
            start_date[year].insert(
                y*degree_order, start_date[year][y*degree_order+1])
            end_date[year].insert(
                y*degree_order, end_date[year][y*degree_order+1])

    ''' Save everything in a list '''
    saved_as_num = [degree, order, clm, slm,
                    clm_std, slm_std, start_date, end_date]

    ''' Number of years '''
    beta = np.zeros(len(start_date))
    sum = 0
    for x in range(0, time_axes+1, 1):
        beta[x] = round(len(start_date[x])/degree_order)
        sum = sum + beta[x]

    ''' Finding the dates for time bounds of data '''
    dates_start, dates_end = [], []
    for i in range(0, time_axes+1, 1):
        j = 0
        while j < beta[i]:
            dates_start.append(str(start_date[i][j*degree_order]))
            dates_end.append(str(end_date[i][j*degree_order]))
            j = j + 1
    print("Number of months of data in each year starting", dates_start[0],
          "& ending", dates_end[-1], beta)

    # import pickle
    # with open("/home/wslvivek/Desktop/level2/pysh_v2/output/saved_as_num","wb") as pk:
    #     pickle.dump(saved_as_num, pk)

    return saved_as_num, dates_start, dates_end, no_of_files
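The `degree_order` quantity used above is the triangular count of (degree, order) pairs with 0 ≤ order ≤ degree ≤ Lmax, i.e. the length of one month's coefficient vector once degrees 0 and 1 are included. A quick sketch:

```python
# Count of (degree, order) pairs with 0 <= order <= degree <= lmax,
# matching degree_order = int((Lmax+1)*(Lmax+2)/2) in the source above.
def degree_order_count(lmax: int) -> int:
    return (lmax + 1) * (lmax + 2) // 2
```

For a typical GRACE L2 product with Lmax = 96 this gives 4753 coefficients per monthly solution.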
Source code in pyshbundle/reader_replacer_csr.py
def reader_replacer_csr(path, path_tn14, path_tn13):
    """Reads the CSR GRACE L2 files in path and replaces the degree-1, C20
    and C30 coefficients using the TN-13 and TN-14 replacement files."""

    file_list = os.listdir(path)

    filenames = os.listdir(path) #Names of files in folder
    # Identify the data product source
    if 'GFZ' in str(filenames[1]):
        source = str('GFZ')  
    elif 'CSR' in str(filenames[1]):
        source = str('CSR')
    elif 'JPL' in str(filenames[1]):
        source = str('JPL')
    elif 'ITSG' in str(filenames[0]):
        source = str('ITSG')
    print(source)
    aa = sorted(file_list, key = last_4chars)  # sort files chronologically; the first file gives the start year
    year_start = aa[0][6:10]
    time_axes = 0

    # Counts the number of files
    no_of_files = 0

    # Empty lists to store variables
    line_num = 0
    oi = 22
    degree = [[] for x in range(oi)]
    order = [[] for x in range(oi)]
    clm = [[] for x in range(oi)]
    slm = [[] for x in range(oi)]
    clm_std = [[] for x in range(oi)]
    slm_std = [[] for x in range(oi)]
    start_date = [[] for x in range(oi)]
    end_date = [[] for x in range(oi)]

    # Iterate through all files
    for file in sorted(file_list, key = last_4chars):
        #print(file)
        if file.endswith(".gz"):
            file_name = f"{path}/{file}"
            # print(file_name[-39:-32])

            # Save yearwise
            year_start, time_axes = TIME(year_start,file_name,time_axes)

            # Call the function 'reader'
            reader(file_name,line_num,degree,order,clm,slm,clm_std,slm_std,start_date,end_date,year_start,time_axes)
            no_of_files = no_of_files + 1

    print('Reading into klm format complete!')
    print("Number of files read:", no_of_files)

    Lmax=degree[0][-1]
    degree_order=int((Lmax+1)*(Lmax+2)/2)
    #''' Replacement '''
    print('Starting replacement')
   # ''' Replace deg 2,3 '''
    new_file_TN14 = path_tn14
    rep_start_date, rep_end_date, c20, c30, c20_sigma, c30_sigma = [], [], [], [], [], []
    with open(new_file_TN14,"r", encoding='utf8') as file:
        stuff = file.readlines()
        for i in range(0,len(stuff),1):
            line_num = str('not found')
            if  stuff[i] == str('Product:\n'):
                print("found:",i+1)
                line_num = i + 1
                break
        if type(line_num) is str:
            print('Replacement data not found')
        pattern = r'\s+'
        count = 0
        while (line_num<len(stuff)):
            split = re.split(pattern, str(stuff[line_num]))
            c20.append( float(split [2]) )
            c30.append( float(split[5]) )
            c20_sigma.append(float(split[4])*1e-10)
            c30_sigma.append(float(split[7])*1e-10)
            rep_start_date.append(str(julian.from_jd(float(split[0]), fmt='mjd').date()))
            rep_end_date.append(str(julian.from_jd(float(split[8]), fmt='mjd').date()))
            line_num = line_num + 1
            count = count + 1

    # Actual replacement    

    index = 0
    margin=datetime.timedelta(days = 5)
    for year in range(0,time_axes+1,1):
        # index = 0
        for y in range(0,round(len(clm[year])/degree_order),1):    
            while(index<len(c20)):

                curr_date = datetime.datetime.strptime(start_date[year][y*degree_order], '%Y-%m-%d')
                rep_date_cmp = datetime.datetime.strptime(rep_start_date[index], '%Y-%m-%d')

                if  rep_date_cmp-margin <= curr_date <= rep_date_cmp+margin:
                    print(curr_date, rep_date_cmp, index)
                    clm[year][y*degree_order+2] = c20[index]
                    clm_std[year][y*degree_order+2] = c20_sigma[index]
                    if math.isnan(c30[index]) == False:
                        clm[year][y*degree_order+3] = c30[index]
                        # NOTE: rectified from slm_dev -> clm_dev
                        clm_std[year][y*degree_order+3] = c30_sigma[index]
                    # increment only after the C30 check so c30[index] refers
                    # to the same TN-14 record as c20[index]
                    index = index + 1
                    break
                else:   
                    index = index +1                           
    print('Degree 2,3 replacement complete!')

    #''' Replace deg 1 '''
    new_file_TN13 = path_tn13  
    rep_start_date_deg1, rep_end_date_deg1, c1m, s1m, c1m_sigma, s1m_sigma = [], [], [], [], [], []
    with open(new_file_TN13,"r", encoding='utf8') as file:
        stuff = file.readlines()
        for i in range(0,len(stuff),1):
            if  stuff[i] == str('end of header ===============================================================================\n'):
                print("found:",i+1)
                line_num = i + 1
                break
            else:
                line_num = str('not found')
        pattern = r'\s+'
        count = 0
        while (line_num<len(stuff)):
            split = re.split(pattern, str(stuff[line_num]))
            c1m.append( float(split [3]) )
            s1m.append( float(split[4]) )
            c1m_sigma.append( float(split[5])*1e-10)
            s1m_sigma.append( float(split[6])*1e-10)
            rep_start_date_deg1.append(split[7][0:4] +'-'+split[7][4:6] +'-'+split[7][6:8])
            rep_end_date_deg1.append(split[8][0:4] +'-'+split[8][4:6] +'-'+split[8][6:8])
            line_num = line_num + 1
            count = count + 1 

    # replace deg 1   
    index = 0
    Lmax = degree[0][-1]
    margin_deg1=datetime.timedelta(days = 5)
    for year in range(0,time_axes+1,1):
        for y in range(0,int(len(clm[year])/degree_order),1):    
            while(index<len(c1m)):
                curr_date_deg1 = datetime.datetime.strptime(start_date[year][y*degree_order], '%Y-%m-%d')
                rep_date_cmp_deg1 = datetime.datetime.strptime(rep_start_date_deg1[index], '%Y-%m-%d')

                if  rep_date_cmp_deg1-margin_deg1 <= curr_date_deg1 <= rep_date_cmp_deg1+margin_deg1:
                    print(curr_date_deg1, rep_date_cmp_deg1, index)
                    # degree[year][y*degree_order+1] = int(1)
                    # degree[year][y*degree_order+1+Lmax] = int(1)
                    # order[year][y*degree_order+1] = int(0)
                    # order[year][y*degree_order+1+Lmax] = int(1)
                    clm[year][y*degree_order+1] = c1m[index]
                    clm[year][y*degree_order+1+Lmax] = c1m[index+1]
                    slm[year][y*degree_order+1] = s1m[index]
                    slm[year][y*degree_order+1+Lmax] = s1m[index+1]
                    clm_std[year][y*degree_order+1] = c1m_sigma[index]
                    clm_std[year][y*degree_order+1+Lmax] = c1m_sigma[index+1]
                    slm_std[year][y*degree_order+1] = s1m_sigma[index]
                    slm_std[year][y*degree_order+1+Lmax] = s1m_sigma[index+1]

                    index = index +2
                    break
                else:
                    index = index +2
    print('Degree 1 replacement complete!') 

    #''' Save everything in a list '''
    saved_as_num=[degree,order,clm,slm,clm_std,slm_std,start_date,end_date]

    #''' Number of years '''
    beta = np.zeros(len(start_date))
    sum = 0
    for x in range(0,time_axes+1,1):
        beta[x] = (len(start_date[x])/degree_order)
        sum = sum + len(start_date[x])/degree_order 

    #''' Finding the dates for time bounds of data '''
    dates_start, dates_end = [],[] 
    for i in range(0,time_axes+1,1):
        j = 0
        while j < beta[i]:
            dates_start.append(str(start_date[i][j*degree_order]))
            dates_end.append(str(end_date[i][j*degree_order]))
            j = j + 1
    print("Number of months of data in each year starting", dates_start[0], \
          "& ending", dates_end[-1], beta)   

    #import pickle
    #with open("/home/wslvivek/Desktop/level2/pysh_v2/output/saved_as_num","wb") as pk:
    #    pickle.dump(saved_as_num, pk)
    return saved_as_num, dates_start,dates_end, no_of_files
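Unlike the JPL variant, the CSR replacement matches dates within a ±5-day window rather than requiring exact equality. The comparison can be sketched in isolation (the helper `within_margin` is hypothetical; the dates are illustrative):

```python
import datetime

# Sketch of the +/- 5-day matching window used in reader_replacer_csr.
def within_margin(data_date: str, tn14_date: str, days: int = 5) -> bool:
    d = datetime.datetime.strptime(data_date, '%Y-%m-%d')
    r = datetime.datetime.strptime(tn14_date, '%Y-%m-%d')
    margin = datetime.timedelta(days=days)
    return r - margin <= d <= r + margin
```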

Reads the ITSG GRACE L2 files and replaces the degree-1, C20 and C30 coefficients using the TN-13 and TN-14 replacement files.

Parameters:

Name Type Description Default
path str

Path to the folder containing the GRACE L2 .gfc files.

required
path_tn14 str

Path to the TN-14 C20/C30 replacement file.

required
path_tn13 str

Path to the TN-13 degree-1 replacement file.

required

Returns:

Name Type Description
tuple

saved_as_num (list of degree, order, clm, slm, clm_std, slm_std, start_date), dates_start, and the number of files read.

Source code in pyshbundle/reader_replacer_itsg.py
def reader_replacer_itsg(path, path_tn14, path_tn13):
    """Reads the ITSG GRACE L2 files and replaces the degree-1, C20 and C30
    coefficients using the TN-13 and TN-14 replacement files.

    Args:
        path (str): Path to the folder containing the GRACE L2 .gfc files.
        path_tn14 (str): Path to the TN-14 C20/C30 replacement file.
        path_tn13 (str): Path to the TN-13 degree-1 replacement file.

    Returns:
        tuple: (saved_as_num, dates_start, no_of_files)
    """
    file_list = os.listdir(path)

    filenames = os.listdir(path)       #Names of files in folder    
    # Identify the data product source
    if 'GFZ' in str(filenames[0]):
        source = str('GFZ')
    elif 'CSR' in str(filenames[0]):
        source = str('CSR')
    elif 'JPL' in str(filenames[0]):
        source = str('JPL')
    elif 'ITSG' in str(filenames[0]):
        source = str('ITSG')
    print(source)
    aa = sorted(file_list, key = last_4chars)  # sort files chronologically; the first file gives the start year
    year_start = aa[0][-11:-7]
    time_axes = 0

    # Counts the number of files
    no_of_files = 0

    # Empty lists to store variables
    line_num = 0
    oi = 22
    degree = [[] for x in range(oi)]
    order = [[] for x in range(oi)]
    clm = [[] for x in range(oi)]
    slm = [[] for x in range(oi)]
    clm_std = [[] for x in range(oi)]
    slm_std = [[] for x in range(oi)]
    start_date = [[] for x in range(oi)]

    # Iterate through all files
    for file in sorted(file_list, key = last_4chars):
        #print(file)
        if file.endswith(".gfc"):
            file_name = f"{path}/{file}"
            #print(file_name[-39:-32])

            # Save yearwise
            year_start, time_axes = TIME(year_start,file_name,time_axes)

            # Call the function 'reader'
            reader(file_name,line_num,degree,order,clm,slm,clm_std,slm_std,start_date,year_start,time_axes)
            no_of_files = no_of_files + 1
    print('Reading into clm format complete!')
    print("Number of files read:", no_of_files)

    Lmax=degree[0][-1]
    degree_order=int((Lmax+1)*(Lmax+2)/2)

    ''' Replacement '''

    print('Starting replacement')
    ''' Replace deg 2,3 '''
    new_file_TN14 = path_tn14

    rep_start_date, rep_end_date, c20, c30, c20_sigma, c30_sigma = [], [], [], [], [], []
    with open(new_file_TN14,"r", encoding='utf8') as file:
        stuff = file.readlines()
        for i in range(0,len(stuff),1):
            line_num = str('not found')
            if  stuff[i] == str('Product:\n'):
                print("found:",i+1)
                line_num = i + 1
                break
        if type(line_num) is str:
            print('Replacement data not found')
        pattern = r'\s+'
        count = 0
        while (line_num<len(stuff)):
            split = re.split(pattern, str(stuff[line_num]))
            c20.append( float(split [2]) )
            c30.append( float(split[5]) )
            c20_sigma.append(float(split[4])*1e-10)
            c30_sigma.append(float(split[7])*1e-10)
            rep_start_date.append(str(julian.from_jd(float(split[0]), fmt='mjd').date()))
            rep_end_date.append(str(julian.from_jd(float(split[8]), fmt='mjd').date()))
            line_num = line_num + 1
            count = count + 1

    # Actual replacement    

    index = 0
    margin=datetime.timedelta(days = 23)
    for year in range(0,time_axes+1,1):
        for y in range(0,int(len(clm[year])/degree_order),1):    
            while(index<len(c20)):

                curr_date = datetime.datetime.strptime(start_date[year][y*degree_order], '%Y-%m')
                rep_date_cmp = datetime.datetime.strptime(rep_start_date[index], '%Y-%m-%d')

                if rep_date_cmp-margin <= curr_date <= rep_date_cmp+margin:
                    print(f"Data date - {curr_date}, replacemebt date (tn-14) = {rep_date_cmp}")
                    clm[year][y*degree_order+3] = c20[index]
                    clm_std[year][y*degree_order+3] = c20_sigma[index]
                    if math.isnan(c30[index]) == False:
                        clm[year][y*degree_order+6] = c30[index]
                        clm_std[year][y*degree_order+6] = c30_sigma[index]
                    index= index + 1
                    break
                else:   
                    index = index +1

    print('Degree 2,3 replacement complete!')


    ''' Replace deg 1 '''
    new_file_TN13 = path_tn13
    rep_start_date_deg1, rep_end_date_deg1, c1m, s1m, c1m_sigma, s1m_sigma = [], [], [], [], [], []
    with open(new_file_TN13,"r", encoding='utf8') as file:
        stuff = file.readlines()
        for i in range(0,len(stuff),1):
            if  stuff[i] == str('end of header ===============================================================================\n'):
                print("found:",i+1)
                line_num = i + 1
                break
            else:
                line_num = str('not found')
        pattern = r'\s+'
        count = 0
        while (line_num<len(stuff)):
            split = re.split(pattern, str(stuff[line_num]))
            c1m.append( float(split [3]) )
            s1m.append( float(split[4]) )
            c1m_sigma.append( float(split [5]) )
            s1m_sigma.append( float(split[6]) )
            rep_start_date_deg1.append(split[7][0:4] +'-'+split[7][4:6] +'-'+split[7][6:8])
            rep_end_date_deg1.append(split[8][0:4] +'-'+split[8][4:6] +'-'+split[8][6:8])
            line_num = line_num + 1
            count = count + 1 

    # replace deg 1   
    index = 0
    margin=datetime.timedelta(days = 23)
    for year in range(0,time_axes+1,1):
        for y in range(0,int(len(clm[year])/degree_order),1):    
            while(index<len(c1m)):

                curr_date = datetime.datetime.strptime(start_date[year][y*degree_order], '%Y-%m')
                rep_date_cmp_deg1 = datetime.datetime.strptime(rep_start_date_deg1[index], '%Y-%m-%d')

                if rep_date_cmp_deg1-margin <= curr_date <=rep_date_cmp_deg1+margin:
                    print(f"Data date - {curr_date}, replacemebt date (tn-13) = {rep_date_cmp_deg1}")
                    # positions +1 and +2 hold (degree 1, order 0) and
                    # (degree 1, order 1); both entries have degree 1
                    degree[year][y*degree_order+1]=int(1)
                    degree[year][y*degree_order+2]=int(1)
                    order[year][y*degree_order+1]=int(0)
                    order[year][y*degree_order+2]=int(1)
                    clm[year][y*degree_order+1]=float(c1m[index])
                    clm[year][y*degree_order+2]=float(c1m[index+1])
                    slm[year][y*degree_order+1]=float(s1m[index])
                    slm[year][y*degree_order+2]=float(s1m[index+1])
                    clm_std[year][y*degree_order+1]=float(c1m_sigma[index])
                    clm_std[year][y*degree_order+2]=float(c1m_sigma[index+1])
                    slm_std[year][y*degree_order+1]=float(s1m_sigma[index])
                    slm_std[year][y*degree_order+2]=float(s1m_sigma[index+1])
                    start_date[year][y*degree_order+1]=rep_start_date_deg1[index]
                    start_date[year][y*degree_order+2]=rep_start_date_deg1[index+1]
                    index = index +2
                    break
                else:
                    index = index +2
    print('Degree 1 replacement complete!')

    ''' Save everything in a list '''
    saved_as_num=[degree,order,clm,slm,clm_std,slm_std,start_date]
    # import pickle
    # with open("/home/wslvivek/Desktop/level2/pysh_v2/output/saved_as_num","wb") as pk:
    #     pickle.dump(saved_as_num, pk)


    ''' Number of years '''
    beta = np.zeros(len(start_date))
    total = 0
    for x in range(0,time_axes+1,1):
        beta[x] = round(len(start_date[x])/degree_order)
        total = total + beta[x]

    ''' Finding the dates for time bounds of data '''
    dates_start = []
    for i in range(0,time_axes+1,1):
        j = 0
        while j < beta[i]:
            dates_start.append(str(start_date[i][j*degree_order]))
            # dates_end.append(str(end_date[i][j*int(degree_order-3)]))
            j = j + 1
    print("Number of months of data in each year starting", dates_start[0], beta)
    return saved_as_num, dates_start, no_of_files
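The ±23-day window matching in the degree-1 replacement loop above can be distilled into a small standalone helper. `within_margin` is an illustrative name, not a pyshbundle function:

```python
import datetime

def within_margin(data_date: str, rep_date: str, days: int = 23) -> bool:
    """Return True if a TN-13 replacement epoch lies within +/- `days` of the data epoch."""
    curr = datetime.datetime.strptime(data_date, '%Y-%m')
    rep = datetime.datetime.strptime(rep_date, '%Y-%m-%d')
    margin = datetime.timedelta(days=days)
    return rep - margin <= curr <= rep + margin

print(within_margin('2002-05', '2002-05-19'))  # True: within 23 days
print(within_margin('2002-05', '2002-07-15'))  # False: outside the window
```

The 23-day margin accommodates the fact that monthly GRACE solutions and TN-13 epochs do not align exactly on calendar-month boundaries.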

Function to read files & extract data

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| file_name | str | Path to the gzipped GRACE L2 data file | required |
| line_num | int | Line number at which to start reading data (updated once the end of the YAML header is found) | required |
| degree | list | Accumulator for SH degree values, per time axis | required |
| order | list | Accumulator for SH order values, per time axis | required |
| clm | list | Accumulator for Clm coefficients | required |
| slm | list | Accumulator for Slm coefficients | required |
| delta_clm | list | Accumulator for Clm uncertainties | required |
| delta_slm | list | Accumulator for Slm uncertainties | required |
| start_date | list | Accumulator for epoch begin dates (YYYY-MM-DD) | required |
| end_date | list | Accumulator for epoch end dates (YYYY-MM-DD) | required |
| year_start | list | Start dates used to group files by year | required |
| time_axes | int | Index of the current time axis (year) | required |
Source code in pyshbundle/reader_replacer.py
def reader(file_name: str,line_num, degree: int, order: int, clm,slm,delta_clm,delta_slm,start_date,end_date,year_start,time_axes):
    """Function to read files & extract data

    Args:
        file_name (_type_): _description_
        line_num (_type_): _description_
        degree (_type_): _description_
        order (_type_): _description_
        clm (_type_): _description_
        slm (_type_): _description_
        delta_clm (_type_): _description_
        delta_slm (_type_): _description_
        start_date (_type_): _description_
        end_date (_type_): _description_
        year_start (_type_): _description_
        time_axes (_type_): _description_
    """
    with gzip.open(file_name,"r") as file:
        stuff = file.readlines()
        for i in range (0,len(stuff),1):
            #print(str(line))
            if str(stuff[i]) == str( b'# End of YAML header\n',):
                #print("found:",i+1)
                line_num = i+1                                                 #Line number of starting line of data
                break
        pattern = r'\s+'                                                       #Delimiter for splitting
        while(line_num<len(stuff)):
            split = re.split(  pattern, str(stuff[line_num]))                  #split wordwise with Regex
            degree[time_axes].append(  int(split[1])  )
            order[time_axes].append( int(split [2])    )
            clm[time_axes].append( float(split [3])    )
            slm[time_axes].append( float(split [4])    )
            delta_clm[time_axes].append( float(split [5])    )
            delta_slm[time_axes].append(  float(split[6])    )
            start_date[time_axes].append(  split[7][0:4]+'-'+split[7][4:6]+'-'+split[7][6:8]  )
            end_date[time_axes].append(  split[8][0:4]+'-'+split[8][4:6]+'-'+split[8][6:8]  )
            line_num = line_num + 1
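A single data record parsed the way `reader` does it, shown on an in-memory line (field values here are illustrative, not taken from a real file):

```python
import re

# Fields of a GSM record: key, degree, order, Clm, Slm, sigma Clm, sigma Slm,
# epoch begin (yyyymmdd.0), epoch end (yyyymmdd.0)
line = "GRCOF2 2 0 -4.841694e-04 0.0 7.4e-13 0.0 20020519.0 20020531.0"

split = re.split(r'\s+', line)
degree, order = int(split[1]), int(split[2])
clm, slm = float(split[3]), float(split[4])
# reformat the epoch begin field from yyyymmdd.0 to yyyy-mm-dd
start_date = split[7][0:4] + '-' + split[7][4:6] + '-' + split[7][6:8]

print(degree, order, start_date)  # 2 0 2002-05-19
```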

Newer Approach

Reads the spherical harmonic data provided by JPL

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| file_path | str | Absolute path to the file | required |

Returns:

| Type | Description |
|---|---|
| dict | Header info in a structured dictionary |
| np.ndarray | SH coefficient data in the form of a NumPy n-dim array; format is CLM |
| set | Epoch start and end times as per the GRACE data file (e.g. 20020519.0) |

Source code in pyshbundle/io.py
def read_jpl(file_path: str):
    """Reads the spherical harmonic data provided by JPL

    Args:
        file_path (str): Absolute path to the file

    Returns:
        dict: Header info in a structured dictionary
        np.ndarray: SH coefficient data in form of a numpy n-dim array. Format is CLM.
        set: Epoch start and end times as per the GRACE datafile. (eg. 20020519.0)

    """
    # ensure that the file path is valid then proceed

    source = 'JPL'

    # check if the file is zipped or not

    if file_path[-3:] == '.gz':
        # open the file and read the lines
        with gzip.open(file_path, 'r') as file:

            # read the file line wise -> obtain a list of bytes
            info_lines = file.readlines()
            num_lines = len(info_lines)

            for i in range(len(info_lines)):
                # find the end of header sentence in the text file
                if str(info_lines[i]) == str(b'# End of YAML header\n',):
                    end_of_header_idx = i
                    break

        # everything after the header is the numerical data
        header_info = info_lines[:end_of_header_idx]
        jpl_data = info_lines[end_of_header_idx+1:]

        # parse the header strings to extract relevant metadata info
        jpl_header = parse_jpl_header(header_info)

        # call the parse data function and create a dictionary or matrix
        clm_mat, start_date, end_date = parse_jpl_data(jpl_data)

        # build a dictionary

        # Organize the data into either matrix, dataframe or dictionary format

    return jpl_header, clm_mat, start_date, end_date
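The extension check above slices the last three characters of the path; a slightly more robust variant (a sketch, `is_gzipped` is not a pyshbundle function) uses `pathlib`:

```python
from pathlib import Path

def is_gzipped(file_path: str) -> bool:
    # Path.suffix isolates the final extension; lower() makes the check
    # case-insensitive, unlike the raw file_path[-3:] slice.
    return Path(file_path).suffix.lower() == '.gz'

print(is_gzipped('GSM-2_2002122-2002137_GRAC_JPLEM_BA01_0600.gz'))  # True
print(is_gzipped('GSM-2_2002122-2002137_GRAC_JPLEM_BA01_0600'))     # False
```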

Reads the spherical harmonic data provided by CSR

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| file_path | str | Absolute path to the GRACE data file from CSR | required |

Returns:

| Type | Description |
|---|---|
| dict | Header info in a structured dictionary |
| np.ndarray | SH coefficient data in the form of a NumPy n-dim array; format is KLM |
| set | Epoch begin and stop dates stored as float (e.g. 20020519.0) |

Source code in pyshbundle/io.py
def read_csr(file_path: str):
    """Reads the spherical harmonic data provided by CSR

    Args:
        file_path (str): Absolute path to the GRACE data file from CSR

    Returns:
        dict: Header info in a structured dictionary
        np.ndarray: SH coefficient data in form of a numpy n-dim array. Format is KLM
        set: Epoch begin and stop dates stored as float (eg. 20020519.0)
    """
    # ensure that the file path is valid then proceed

    source = 'CSR'

    # check if the file is zipped or not
    if file_path[-3:] == '.gz':
        # open the file and read the lines
        with gzip.open(file_path, 'r') as file:

            # read the file line wise -> obtain a list of bytes
            info_lines = file.readlines()
            num_lines = len(info_lines)

            for i in range(len(info_lines)):
                # find the index of line which indicates end of header info
                if str(info_lines[i]) == str(b'# End of YAML header\n',):
                    end_of_header_idx = i
                    break

        header_info = info_lines[:end_of_header_idx]
        # everything below header info is the sought after numeric data
        csr_data = info_lines[end_of_header_idx+1:]

        # parse the header strings to extract relevant metadata info
        csr_header = parse_jpl_header(header_info)

        # parse the data strings to extract numeric data in suitable matrix fmt
        klm_mat, start_time, stop_time = parse_csr_data(csr_data)

        # Organize the data into either matrix, dataframe or dictionary format

    return csr_header, klm_mat, start_time, stop_time
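The header/data split that both readers perform can be exercised on an in-memory list of byte strings (the file content here is a short stand-in, not real GRACE data):

```python
lines = [
    b'title: GRACE L2 product\n',
    b'# End of YAML header\n',
    b'GRCOF2 2 0 -4.84e-04 0.0 7.0e-13 0.0 20020519.0 20020531.0\n',
]

# locate the sentinel line that terminates the YAML header
end_of_header_idx = next(i for i, ln in enumerate(lines)
                         if ln == b'# End of YAML header\n')

header_info = lines[:end_of_header_idx]       # everything above the sentinel
data = lines[end_of_header_idx + 1:]          # numeric records below it

print(end_of_header_idx, len(header_info), len(data))  # 1 1 1
```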


Reads the TN-13 replacement file containing the degree-1 spherical harmonic coefficients

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| file_path | str | Absolute path to the TN-13 replacement file | required |

Returns:

| Type | Description |
|---|---|
| np.ndarray | Matrix of degree-1 replacement coefficients |

Source code in pyshbundle/io.py
def read_tn13(file_path):
    """_summary_

    Args:
        file_path (_type_): _description_

    Returns:
        _type_: _description_
    """

    # check if file exists at given path

    # check the file format - txt of zipped

    if file_path[-4:] == '.txt':
        with open(file_path, 'r', encoding='utf8') as file:

            # read the file line wise -> obtain a list of bytes
            info_lines = file.readlines()
            num_lines = len(info_lines)

            # find the end of header
            for i in range(len(info_lines)):
                if info_lines[i] == 'end of header ===============================================================================\n':
                    end_of_header_idx = i
                    break

        header_info = info_lines[:end_of_header_idx]
        tn13_data = info_lines[end_of_header_idx+1:]

        # call the parse header info function and create a metadata dictionary

        # call the parse data function and create a dictionary or matrix
        tn13_clm_mat = parse_tn13_data(tn13_data)

        # Organize the data into either matrix, dataframe or dictionary format

    return tn13_clm_mat

Reads the TN-14 replacement file containing the C20 and C30 coefficient corrections

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| file_path | str | Absolute path to the TN-14 replacement file | required |

Returns:

| Type | Description |
|---|---|
| np.ndarray | Matrix of C20/C30 replacement coefficients |

Source code in pyshbundle/io.py
def read_tn14(file_path):
    """_summary_

    Args:
        file_path (_type_): _description_

    Returns:
        _type_: _description_
    """

    # check the file extension - '.txt' or some zipped format

    # read the data into a list
    if file_path[-4:] == '.txt':
        with open(file_path, 'r', encoding='utf8') as file:

            # read the file line wise -> obtain a list of bytes
            info_lines = file.readlines()

            # find the end of header
            for i in range(len(info_lines)):
                if info_lines[i] == 'Product:\n':
                    end_of_header_idx = i
                    break

        header_info = info_lines[:end_of_header_idx]
        tn14_data = info_lines[end_of_header_idx+1:]

        # parse the data
        tn14_replacement_mat = parse_tn14_data(tn14_data)

    return tn14_replacement_mat
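TN-13 and TN-14 terminate their headers with different sentinel lines ('end of header ...' vs 'Product:'). A generic locator (a hypothetical helper, not part of pyshbundle) makes the two readers symmetric:

```python
def find_data_start(lines, sentinel):
    """Return the index of the first data line after the header sentinel."""
    for i, line in enumerate(lines):
        if line.startswith(sentinel):
            return i + 1
    raise ValueError(f'sentinel {sentinel!r} not found')

tn14_lines = ['header line\n', 'Product:\n', '2002.3 ... data ...\n']
print(find_data_start(tn14_lines, 'Product:'))  # 2

tn13_lines = ['header line\n', 'end of header ====\n', 'GRCOF2 ...\n']
print(find_data_start(tn13_lines, 'end of header'))  # 2
```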

Computing Long Term Mean

The purpose of this script is to load the long-term mean for the GRACE SH data.

It requires the GRACE data source (CSR, JPL or ITSG) as well as the path to the long-term mean values.

For RL06, sample data have been provided within the package. If this option is chosen, the program directly returns the preloaded long-term mean values.

Returns the path to the long-term mean file.

load_longterm_mean(source='', use_sample_mean=0)

Loads the long-term mean values for the GRACE SH data

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| source | str | Source of the SH data (CSR, JPL or ITSG). Defaults to "". | '' |
| use_sample_mean | int | Set to 1 to use the preloaded RL06 sample mean values. Defaults to 0. | 0 |

Raises:

| Type | Description |
|---|---|
| Exception | Raised on an incorrect selection of source |

Returns:

| Type | Description |
|---|---|
| str | Path to the long-term mean (.npy) file |

Todo
  • Not sure if using "source = ''" is all right
  • Instead of the base Exception, a ValueError could be raised
Source code in pyshbundle/load_longterm_mean.py
def load_longterm_mean(source = "", use_sample_mean = 0):
    """_summary_

    Args:
        source (str, optional): _description_. Defaults to "".
        use_sample_mean (int, optional): _description_. Defaults to 0.

    Raises:
        Exception: _description_

    Returns:
        _type_: _description_

    Todo:
        + Not sure if using "source = ''" is all right
        + Instead of the base Exception, a ValueError could be raised
    """
    if use_sample_mean == 1:
        print("Loading preloaded RL06 long term mean values")
        print("Please ensure that your data is RL06 \nIf not, please manually input long term mean by setting use_sample_mean = 0")

        if str.upper(source) == 'CSR':
            long_mean = pkg_resources.resource_filename('pyshbundle', 'data/RL06_long_mean/SH_long_mean_csr.npy')
        elif str.upper(source) == 'JPL':
            long_mean = pkg_resources.resource_filename('pyshbundle', 'data/RL06_long_mean/SH_long_mean_jpl.npy')
        elif str.upper(source) == 'ITSG':
            long_mean = pkg_resources.resource_filename('pyshbundle', 'data/RL06_long_mean/SH_long_mean_itsg.npy')
        else:
            raise Exception("Incorrect selection of source")
        print("Successfully loaded preloaded longterm means")
    else:
        print("Please download and provide the longterm GRACE SH mean values")
        print("Instructions to download the longterm GRACE SH mean values may be referred to in https://github.com/mn5hk/pyshbundle/blob/main/docs/index.md#how-to-download-data")
        long_mean = str(input("Enter the longterm mean for the SH values in the numpy (.npy) format"))
        print("Successfully loaded path to long term mean:", long_mean)

    return long_mean
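Once the path is returned, the long-term mean is typically loaded with `np.load` and subtracted from each monthly field to obtain anomalies. The shapes below are stand-ins for illustration:

```python
import numpy as np

# stand-in for: long_mean = np.load(load_longterm_mean(...))
long_mean = np.zeros((4, 4))

# stand-in for three months of SH coefficient grids
monthly = np.ones((3, 4, 4))

# broadcasting subtracts the (4, 4) mean from every month along axis 0
anomalies = monthly - long_mean
print(anomalies.shape)  # (3, 4, 4)
```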

Physical and Geodetic Constants

This script contains the major relevant physical and geodetic (GRS80) constants:

  • clight speed of light - \(2.99792458e+8\) \(m/s\)
  • G gravitational constant - \(6.67259e-11\) \(\frac{m^3}{kg \cdot s^2}\)
  • au astronomical unit - \(149.597870691e+9\) \(m\)

  • ae semi-major axis of ellipsoid GRS80 - \(6378137\) \(m\)

  • GM geocentric gravitational constant GRS80 - \(3.986005e+14\) \(\frac{m^3}{s^2}\)
  • J2 Earth's dynamic form factor GRS80 - \(1.08263e-3\) (unitless, unnormalized C20 coefficient)
  • Omega mean angular velocity GRS80 - \(7.292115e-5\) \(\frac{rad}{s}\)

  • flat flattening - \(\frac{1}{298.257222101}\)
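The constants above, written out as plain Python module-level values (the names here are illustrative, not necessarily those used inside pyshbundle):

```python
CLIGHT = 2.99792458e8       # speed of light [m/s]
G      = 6.67259e-11        # gravitational constant [m^3 kg^-1 s^-2]
AU     = 149.597870691e9    # astronomical unit [m]
AE     = 6378137.0          # GRS80 semi-major axis [m]
GM     = 3.986005e14        # GRS80 geocentric gravitational constant [m^3/s^2]
J2     = 1.08263e-3         # GRS80 dynamic form factor (unnormalized C20)
OMEGA  = 7.292115e-5        # GRS80 mean angular velocity [rad/s]
FLAT   = 1 / 298.257222101  # GRS80 flattening

# quick consistency check: GM / ae^2 gives a surface-gravity estimate
print(round(GM / AE**2, 2))  # about 9.8 m/s^2
```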

Reference