Utility Functions

Utility functions are provided for reading from and writing to JSON and CSV files.

The functions can be imported via

from pysna.utils import export_to_json, export_to_csv, append_to_json, append_to_csv, load_from_json

or are included in the wildcard import:

from pysna import *

Export to JSON

Function:

export_to_json(data: dict, export_path: str, encoding: str = 'utf-8', ensure_ascii: bool = False, *args)

Export dictionary data to a JSON file. The function adds a data key to the JSON file and stores the provided dictionary inside the data field.

Args:

data (dict): Data dictionary.
export_path (str): Export path including file name and extension.
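encoding (str, optional): Encoding of JSON file. Defaults to UTF-8.
ensure_ascii (bool, optional): Whether non-ASCII characters are escaped in the output, as in json.dump(). Defaults to False.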
args (optional): Further arguments to be passed to json.dump().

References: https://docs.python.org/3/library/json.html

NOTE: When exporting a dictionary that contains tuples as keys, the function will try to serialize them by converting the tuples to strings. A tuple like ("WWU_Muenster", "goetheuni") will then be encoded as: "__tuple__ ['WWU_Muenster', 'goetheuni']". To recover the original dictionary after the JSON export, use the load_from_json function.
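
The reason for this conversion is that JSON object keys must be strings, so json.dump() rejects tuple keys. The following is a minimal sketch of such an encoding step, not pysna's actual implementation: encode_tuple_keys and TUPLE_PREFIX are hypothetical names, and the exact marker format is an assumption.

```python
import json

TUPLE_PREFIX = "__tuple__"  # assumed marker string, for illustration only


def encode_tuple_keys(obj):
    """Recursively replace tuple keys with prefixed string keys."""
    if isinstance(obj, dict):
        encoded = {}
        for key, value in obj.items():
            if isinstance(key, tuple):
                # a tuple key becomes the marker plus its list representation
                key = TUPLE_PREFIX + str(list(key))
            encoded[key] = encode_tuple_keys(value)
        return encoded
    if isinstance(obj, list):
        return [encode_tuple_keys(item) for item in obj]
    return obj


similarity = {("WWU_Muenster", "goetheuni"): 0.578077}
# json.dumps(similarity) would raise a TypeError because of the tuple key
encoded = encode_tuple_keys(similarity)
print(json.dumps(encoded))
```

After encoding, the dictionary contains only string keys and can be passed to json.dump() as usual.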

Example:

# request results for Tweet comparison, return timestamp
results = api.compare_tweets([1612443577447026689, 1611301422364082180, 1612823288723476480],
                             compare=["common_liking_users"],
                             return_timestamp=True)
# export to JSON file
export_to_json(results, export_path="compare_tweets.json")

The exported compare_tweets.json file will then look like:

{
    "data": [
        {
            "common_liking_users": [
                3862364523
            ],
            "utc_timestamp": "2023-01-31 09:22:11.996652"
        }
    ]
}
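
The wrapping into a data field can be sketched with the standard library alone. This is an illustration under assumptions, not pysna's implementation: write_wrapped_json is a hypothetical helper, and storing the dictionary as the first element of a list under data is inferred from the example output.

```python
import json
import os
import tempfile


def write_wrapped_json(data, export_path, encoding="utf-8", ensure_ascii=False):
    """Hypothetical helper: store `data` as the first entry of a 'data' list."""
    with open(export_path, "w", encoding=encoding) as f:
        json.dump({"data": [data]}, f, ensure_ascii=ensure_ascii, indent=4)


# write to a temporary directory so the sketch leaves no files behind
path = os.path.join(tempfile.mkdtemp(), "compare_tweets.json")
write_wrapped_json({"common_liking_users": [3862364523]}, path)

with open(path, encoding="utf-8") as f:
    content = json.load(f)
print(content)
```

Keeping a list under data is what makes it possible to add further entries to the same file later.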

Append to JSON

Function:

append_to_json(input_dict: Dict[str, Any], filepath: str, encoding: str = "utf-8", **kwargs)

Append a dictionary to an existing JSON file.
The existing JSON file must contain a 'data' key.

Args:

input_dict (Dict[str, Any]): Dictionary containing new data that should be added to the file.
filepath (str): Absolute or relative file path, including the file extension. Relative paths are resolved against the current working directory.
encoding (str, optional): Encoding of the file. Defaults to UTF-8.
kwargs (optional): Additional keyword arguments to be passed to json.dump() and json.load().

References: https://docs.python.org/3/library/json.html

NOTE: When appending a dictionary that contains tuples as keys, the function will try to serialize them by converting the tuples to strings. To recover the original dictionary after the JSON export, use the load_from_json function.

Example:

# generate new results that should be appended in the next step
new_results = api.compare_tweets([1612443577447026689, 1611301422364082180, 1612823288723476480],
                                 compare=["common_liking_users"],
                                 return_timestamp=True)

# append to an existing file.
append_to_json(new_results, "compare_tweets.json")

The compare_tweets.json file will then contain one additional entry within the data field. An example output could look like:

{
    "data": [
        {
            "common_liking_users": [
                3862364523
            ],
            "utc_timestamp": "2023-01-31 09:22:11.996652"
        },
        {
            "common_liking_users": [
                3862364523
            ],
            "utc_timestamp": "2023-01-31 09:23:05.848485"
        }
    ]
}
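
Conceptually, appending means loading the file, adding the new dictionary to the list under data, and writing the file back. The sketch below illustrates this with the standard library; append_entry is a hypothetical helper and the KeyError is an assumption, since the exception pysna raises for a missing data key is not stated here.

```python
import json
import os
import tempfile


def append_entry(entry, filepath, encoding="utf-8"):
    """Hypothetical helper: append `entry` to the 'data' list of a JSON file."""
    with open(filepath, encoding=encoding) as f:
        content = json.load(f)
    if "data" not in content:
        raise KeyError("Existing JSON file needs a 'data' key.")
    content["data"].append(entry)
    # rewrite the whole file; JSON cannot be extended in place
    with open(filepath, "w", encoding=encoding) as f:
        json.dump(content, f, ensure_ascii=False, indent=4)


path = os.path.join(tempfile.mkdtemp(), "compare_tweets.json")
with open(path, "w", encoding="utf-8") as f:
    json.dump({"data": [{"utc_timestamp": "2023-01-31 09:22:11.996652"}]}, f)

append_entry({"utc_timestamp": "2023-01-31 09:23:05.848485"}, path)

with open(path, encoding="utf-8") as f:
    updated = json.load(f)
print(len(updated["data"]))
```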

Load from JSON

Function:

load_from_json(filepath: str, encoding: str = "utf-8", **kwargs) -> dict

Load a Python dictionary from a JSON file. Serialized tuple keys are recovered.

Args:

filepath (str): Path to JSON file.
encoding (str, optional): Encoding of file. Defaults to UTF-8.
kwargs (optional): Keyword arguments to be passed to json.load().

Returns: Python Dictionary containing (deserialized) data from JSON file.

References: https://docs.python.org/3/library/json.html

NOTE: Tuples that have been encoded by the export_to_json function with a leading __tuple__ string will be recovered to their original tuple representation. For instance, an encoded tuple __tuple__ ["WWU_Muenster", "goetheuni"] will be returned as ("WWU_Muenster", "goetheuni").

Example: Suppose an example.json file contains one entry with a serialized tuple key:

{
    "data": [
        {
            "__tuple__ ['WWU_Muenster', 'goetheuni']": 0.578077
        }
    ]
}

By calling:

from pysna.utils import load_from_json

data = load_from_json("example.json")
print(data)

the tuple will be recovered and a conventional Python Dictionary will be returned:

{("WWU_Muenster", "goetheuni"): 0.578077}
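
One way such a recovery could be implemented is with the object_pairs_hook parameter of json.load() and json.loads(), which is called for every decoded JSON object. This is a sketch under assumptions, not pysna's implementation: restore_tuple_keys and TUPLE_PREFIX are hypothetical names, and the marker format follows the example file above.

```python
import ast
import json

TUPLE_PREFIX = "__tuple__"  # assumed marker string, for illustration only


def restore_tuple_keys(pairs):
    """object_pairs_hook: turn prefixed string keys back into tuples."""
    restored = {}
    for key, value in pairs:
        if key.startswith(TUPLE_PREFIX):
            # parse the list literal that follows the marker
            key = tuple(ast.literal_eval(key[len(TUPLE_PREFIX):].strip()))
        restored[key] = value
    return restored


raw = """{"data": [{"__tuple__ ['WWU_Muenster', 'goetheuni']": 0.578077}]}"""
decoded = json.loads(raw, object_pairs_hook=restore_tuple_keys)
print(decoded["data"][0])
```

ast.literal_eval only evaluates Python literals, so the key string is parsed without executing arbitrary code.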

Export to CSV

Function:

export_to_csv(data: dict, export_path: str, encoding: str = "utf-8", sep: str = ",", **kwargs)

Export dictionary data to a CSV file.
Raises an exception if the data dictionary contains nested dictionaries.

Args:

data (dict): Data dictionary.
export_path (str): Export path including file name and extension.
encoding (str, optional): Encoding of CSV file. Defaults to UTF-8.
sep (str, optional): Value separator for CSV file. Defaults to ','.
kwargs (optional): Keyword arguments for pandas.DataFrame.to_csv.

References: https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.to_csv.html

Example:

# request results for user information, return timestamp
results = api.user_info("WWU_Muenster",
                        ["id", "location", "friends_count", "followers_count", "last_active", "statuses_count"],
                        return_timestamp=True)
# export to CSV file
export_to_csv(results, export_path="user_info.csv")
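
The restriction on nested dictionaries follows from the flat, tabular layout of CSV: each key becomes a column and each value a single cell. The sketch below illustrates this with Python's built-in csv module instead of pandas, so it is not how export_to_csv works internally. write_flat_dict is a hypothetical helper, the ValueError is an assumed exception type, and the values are made up.

```python
import csv
import os
import tempfile


def write_flat_dict(data, export_path, encoding="utf-8", sep=","):
    """Hypothetical helper: write a flat dictionary as a one-row CSV file."""
    if any(isinstance(value, dict) for value in data.values()):
        raise ValueError("Nested dictionaries cannot be written to CSV.")
    with open(export_path, "w", encoding=encoding, newline="") as f:
        writer = csv.DictWriter(f, fieldnames=list(data), delimiter=sep)
        writer.writeheader()   # keys become the column names
        writer.writerow(data)  # values become the single data row


path = os.path.join(tempfile.mkdtemp(), "user_info.csv")
write_flat_dict({"id": 123, "location": "Muenster", "followers_count": 42}, path)

with open(path, encoding="utf-8", newline="") as f:
    rows = list(csv.reader(f))
print(rows)
```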

Append to CSV

Function:

append_to_csv(data: dict, filepath: str, encoding: str = "utf-8", sep: str = ",")

Append a dictionary to an existing CSV file.
Raises an exception if the data dictionary contains nested dictionaries.

Args:

data (dict): Dictionary containing new data that should be added to file.
filepath (str): Absolute or relative file path, including the file extension. Relative paths are resolved against the current working directory.
encoding (str, optional): Encoding of CSV file. Defaults to UTF-8.
sep (str, optional): Value separator for CSV file. Defaults to ",".

Example:

# request results for user information, return timestamp
results = api.user_info("WWU_Muenster",
                        ["id", "location", "friends_count", "followers_count", "last_active", "statuses_count"],
                        return_timestamp=True)
# append to the existing CSV file
append_to_csv(results, filepath="user_info.csv")
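
Appending a row only makes sense if the new values line up with the columns already in the file. A minimal sketch of that idea with the built-in csv module follows; it is not pysna's implementation, append_flat_dict is a hypothetical helper, and the sample values are made up.

```python
import csv
import os
import tempfile


def append_flat_dict(data, filepath, encoding="utf-8", sep=","):
    """Hypothetical helper: append a flat dictionary as one row to a CSV file."""
    if any(isinstance(value, dict) for value in data.values()):
        raise ValueError("Nested dictionaries cannot be written to CSV.")
    # read the existing header so the new row uses the same column order
    with open(filepath, encoding=encoding, newline="") as f:
        header = next(csv.reader(f, delimiter=sep))
    with open(filepath, "a", encoding=encoding, newline="") as f:
        csv.DictWriter(f, fieldnames=header, delimiter=sep).writerow(data)


path = os.path.join(tempfile.mkdtemp(), "user_info.csv")
with open(path, "w", encoding="utf-8", newline="") as f:
    f.write("id,followers_count\r\n1,10\r\n")

# key order differs from the header on purpose
append_flat_dict({"followers_count": 11, "id": 1}, path)

with open(path, encoding="utf-8", newline="") as f:
    rows = list(csv.reader(f))
print(rows)
```

Because csv.DictWriter maps values by key, the order of keys in the appended dictionary does not matter. With its default settings, a key that is missing from the header raises a ValueError.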

Notes

Currently, only the JSON and CSV file formats are supported.