Utility Functions
Utility functions are provided for reading from and writing to files. The functions can be imported via
from pysna.utils import export_to_json, export_to_csv, append_to_json, append_to_csv, load_from_json
or are included in the import-all-statement:
from pysna import *
Export to JSON
Function:
export_to_json(data: dict, export_path: str, encoding: str = 'utf-8', ensure_ascii: bool = False, *args)
Export dictionary data to a JSON file. The function adds a data key to the JSON file and stores the provided dictionary inside the data field.
Args:
data (dict): Data dictionary.
export_path (str): Export path including file name and extension.
encoding (str, optional): Encoding of the JSON file. Defaults to UTF-8.
ensure_ascii (bool, optional): Whether non-ASCII characters are escaped in the output. Defaults to False.
args (optional): Further arguments to be passed to json.dump().
References: https://docs.python.org/3/library/json.html
NOTE: When exporting a dictionary that contains tuples as keys, the function will try to serialize them by converting the tuples to strings. A tuple like ("WWU_Muenster", "goetheuni") will then be encoded as "__tuple__ ['WWU_Muenster', 'goetheuni']". To recover the original dictionary after JSON export, use the load_from_json function.
Example:
# request results for Tweet comparison, return timestamp
results = api.compare_tweets([1612443577447026689, 1611301422364082180, 1612823288723476480],
compare=["common_liking_users"],
return_timestamp=True)
# export to JSON file
export_to_json(results, export_path="compare_tweets.json")
The exported compare_tweets.json file will then look like:
{
  "data": [
    {
      "common_liking_users": [
        3862364523
      ],
      "utc_timestamp": "2023-01-31 09:22:11.996652"
    }
  ]
}
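The tuple-to-string serialization described in the note above can be sketched with the standard library alone. The helper name serialize_tuple_keys is hypothetical and not part of pysna, and the exact quoting of the encoded key may differ from pysna's internal format:

```python
import json

def serialize_tuple_keys(data: dict) -> dict:
    """Replace tuple keys with '__tuple__ [...]' strings so json.dump accepts them."""
    out = {}
    for key, value in data.items():
        if isinstance(key, tuple):
            # encode the tuple as a prefixed string key
            key = "__tuple__ " + json.dumps(list(key))
        out[key] = value
    return out

# a dictionary with a tuple key, as produced by pairwise comparisons
similarity = {("WWU_Muenster", "goetheuni"): 0.578077}
print(json.dumps({"data": [serialize_tuple_keys(similarity)]}))
```

String keys pass through unchanged; only tuple keys receive the prefix, which is what makes a later lossless recovery possible.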
Append to JSON
Function:
append_to_json(input_dict: Dict[str, Any], filepath: str, encoding: str = "utf-8", **kwargs)
Append a dictionary to an existing JSON file.
The existing JSON file needs a 'data' key.
Args:
input_dict: Dictionary containing new data that should be added to the file.
filepath: Absolute or relative filepath including the file extension. Relative paths are resolved against the current working directory.
encoding: The encoding of the file. Defaults to UTF-8.
kwargs: Additional keyword arguments to be passed to json.dump() and json.load().
References: https://docs.python.org/3/library/json.html
Note: When appending a dictionary that contains tuples as keys, the function will try to serialize them by converting the tuples to strings. To recover the original dictionary after JSON export, use the load_from_json function.
Example:
# generate new results that should be appended in the next step
new_results = api.compare_tweets([1612443577447026689, 1611301422364082180, 1612823288723476480],
compare=["common_liking_users"],
return_timestamp=True)
# append to the existing JSON file
append_to_json(new_results, "compare_tweets.json")
The extended compare_tweets.json file will be supplemented with one further entry within the data field. An example output could look like:
{
  "data": [
    {
      "common_liking_users": [
        3862364523
      ],
      "utc_timestamp": "2023-01-31 09:22:11.996652"
    },
    {
      "common_liking_users": [
        3862364523
      ],
      "utc_timestamp": "2023-01-31 09:23:05.848485"
    }
  ]
}
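The append step amounts to reading the existing file, extending its data list, and writing the file back. A minimal stdlib sketch of this idea (the function name append_entry is hypothetical, not the pysna implementation):

```python
import json
from pathlib import Path

def append_entry(filepath: str, entry: dict, encoding: str = "utf-8") -> None:
    """Read the JSON file, append one entry to its 'data' list, write it back."""
    path = Path(filepath)
    content = json.loads(path.read_text(encoding=encoding))
    content["data"].append(entry)  # the existing file must have a 'data' key
    path.write_text(json.dumps(content, indent=2), encoding=encoding)

# create a file with one entry, then append a second one
Path("demo.json").write_text(json.dumps({"data": [{"a": 1}]}))
append_entry("demo.json", {"a": 2})
print(json.loads(Path("demo.json").read_text())["data"])
```

Because the whole file is re-serialized on every call, this approach suits incremental result collection rather than high-frequency logging.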
Load from JSON
Function:
load_from_json(filepath: str, encoding: str = "utf-8", **kwargs) -> dict
Load a Python dictionary from a JSON file. Tuples are recovered.
Args:
filepath (str): Path to the JSON file.
encoding (str, optional): Encoding of the file. Defaults to UTF-8.
kwargs (optional): Keyword arguments to be passed to json.load().
Returns: Python Dictionary containing (deserialized) data from JSON file.
References: https://docs.python.org/3/library/json.html
NOTE: Tuples that have been encoded by the export_to_json function with a leading __tuple__ string will be recovered to their original tuple representation. For instance, an encoded tuple __tuple__ ["WWU_Muenster", "goetheuni"] will be returned as ("WWU_Muenster", "goetheuni").
Example:
Suppose an example.json file contains one entry with a serialized tuple key:
{
"data": [
{
"__tuple__ ['WWU_Muenster', 'goetheuni']": 0.578077
}
]
}
By calling:
from pysna.utils import load_from_json
data = load_from_json("example.json")
print(data)
the tuple will be recovered and a conventional Python dictionary will be returned:
{("WWU_Muenster", "goetheuni"): 0.578077}
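The recovery step can be sketched as: detect keys carrying the __tuple__ prefix and turn the embedded list literal back into a tuple. This is an illustration with a hypothetical helper name, assuming the prefix format shown in the example above:

```python
import ast

def recover_tuple_keys(data: dict) -> dict:
    """Convert keys of the form "__tuple__ [...]" back into real tuple keys."""
    out = {}
    for key, value in data.items():
        if isinstance(key, str) and key.startswith("__tuple__"):
            # parse the list literal after the prefix and convert it to a tuple
            key = tuple(ast.literal_eval(key[len("__tuple__"):].strip()))
        out[key] = value
    return out

loaded = {"__tuple__ ['WWU_Muenster', 'goetheuni']": 0.578077}
print(recover_tuple_keys(loaded))  # -> {('WWU_Muenster', 'goetheuni'): 0.578077}
```

ast.literal_eval is used instead of eval so that only literal Python values can be reconstructed from the stored string.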
Export to CSV
Function:
export_to_csv(data: dict, export_path: str, encoding: str = "utf-8", sep: str = ",", **kwargs)
Export dictionary data to a CSV file. Raises an exception if the data dictionary contains nested dictionaries.
Args:
data (dict): Data dictionary.
export_path (str): Export path including file name and extension.
encoding (str, optional): Encoding of the CSV file. Defaults to UTF-8.
sep (str, optional): Value separator for the CSV file. Defaults to ','.
kwargs (optional): Keyword arguments for pandas.DataFrame.to_csv.
References: https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.to_csv.html
Example:
# request results for user information, return timestamp
results = api.user_info("WWU_Muenster",
["id", "location", "friends_count", "followers_count", "last_active", "statuses_count"],
return_timestamp=True)
# export to CSV file
export_to_csv(results, export_path="user_info.csv")
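The flat-dictionary-to-CSV step can be illustrated with the standard library (pysna itself delegates to pandas.DataFrame.to_csv; the helper name and the sample values below are purely illustrative):

```python
import csv

def export_flat_dict(data: dict, export_path: str, encoding: str = "utf-8", sep: str = ",") -> None:
    """Write a flat dictionary as a header row plus a single value row."""
    if any(isinstance(value, dict) for value in data.values()):
        # mirrors the documented restriction: nested dictionaries are rejected
        raise ValueError("nested dictionaries are not supported")
    with open(export_path, "w", encoding=encoding, newline="") as f:
        writer = csv.writer(f, delimiter=sep)
        writer.writerow(data.keys())    # column names
        writer.writerow(data.values())  # one row of values

# illustrative values, not real API results
export_flat_dict({"id": 1, "location": "Muenster", "followers_count": 2}, "user_info_demo.csv")
```

The nested-dictionary check exists because a single CSV row has no natural representation for hierarchical values.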
Append to CSV
Function:
append_to_csv(data: dict, filepath: str, encoding: str = "utf-8", sep: str = ",")
Append a dictionary to an existing CSV file. Raises an exception if the data dictionary contains nested dictionaries.
Args:
data (dict): Dictionary containing new data that should be added to the file.
filepath (str): Absolute or relative filepath including the file extension. Relative paths are resolved against the current working directory.
encoding (str, optional): Encoding of the CSV file. Defaults to UTF-8.
sep (str, optional): Value separator for the CSV file. Defaults to ','.
References:
- https://pandas.pydata.org/docs/reference/api/pandas.read_csv.html
- https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.to_csv.html
Example:
# request results for user information, return timestamp
results = api.user_info("WWU_Muenster",
["id", "location", "friends_count", "followers_count", "last_active", "statuses_count"],
return_timestamp=True)
# append to the existing CSV file
append_to_csv(results, filepath="user_info.csv")
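Appending to a CSV file boils down to reading the existing header and writing one new row in the same column order. A stdlib sketch of that idea (the helper name and sample values are hypothetical; pysna uses pandas.read_csv and pandas.DataFrame.to_csv internally):

```python
import csv

def append_flat_dict(data: dict, filepath: str, encoding: str = "utf-8", sep: str = ",") -> None:
    """Append one row to an existing CSV file, ordered by that file's header."""
    with open(filepath, "r", encoding=encoding, newline="") as f:
        header = next(csv.reader(f, delimiter=sep))  # first line holds the column names
    with open(filepath, "a", encoding=encoding, newline="") as f:
        # write the values in header order so columns stay aligned
        csv.writer(f, delimiter=sep).writerow(data[column] for column in header)

# create an initial file, then append a second row with the same columns
with open("user_info_demo2.csv", "w", newline="") as f:
    csv.writer(f).writerows([["id", "location"], [1, "Muenster"]])
append_flat_dict({"id": 2, "location": "Frankfurt"}, "user_info_demo2.csv")
```

Ordering the new row by the file's header, rather than by the dictionary's insertion order, keeps the appended values aligned with the existing columns.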
Notes
- Only JSON and CSV file formats are supported so far.