
You Must Know Python JSON Dumps, But Maybe Not All Aspects


Some tricks about the dump(s) method of the Python JSON module

Python has built-in support for JSON documents through its “json” module. I bet most of us have used it, and some of us have used it a lot. We know that we can use the json.dumps() method to easily convert a Python dictionary object into a JSON string. However, this method is not as simple as most developers think.

In this article, I’ll introduce the json.dumps() method through examples. They are ordered from the common and simple to the rare and more complex.

Let’s begin. There is no need to download anything, because Python 3.x ships with the json module in its standard library.

import json

Table of Contents

1. Auto-Indentation for Pretty Output
2. Customised Separators
3. Sort Keys
4. Skip Non-Basic Key Types
5. Non-ASCII Characters
6. Circular Check
7. Allow NaN (Not a Number)
8. Customised JSON Encoder

0. Differences between “dump” and “dumps”

In case some newbies are reading this article, it is probably necessary to mention that there are two similar methods in the JSON module: dump() and dumps().

These two methods have almost identical signatures, but dump() writes the converted JSON string to a stream (usually a file), whereas dumps() only converts the object into a JSON string and returns it.

Suppose we have a simple dictionary as follows.

my_dict = {
    'name': 'Chris',
    'age': 33
}

If we want to convert it to a JSON string and write it to a file, we can use dump().

with open('my.json', 'w') as f:
    json.dump(my_dict, f)

Then, if we check the working directory, the file my.json should be there with the converted JSON string.

If we use dumps(), it simply dumps the dictionary into a string.

json.dumps(my_dict)

Since these two methods have almost the same signature (dump() only adds the file object as its second argument), this article focuses on dumps(). All of these tricks work for dump() as well.
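To make the relationship concrete, here is a minimal sketch that sends the same dictionary through both methods. It uses io.StringIO as an in-memory stand-in for a file, which is my choice for illustration and not something from the examples above.

```python
import io
import json

my_dict = {'name': 'Chris', 'age': 33}

# dumps() returns the JSON text as a Python string.
as_string = json.dumps(my_dict)

# dump() writes the same text to any object that has a write() method.
# io.StringIO is used here only as an in-memory stand-in for a file.
buffer = io.StringIO()
json.dump(my_dict, buffer)
written = buffer.getvalue()

print(as_string)             # {"name": "Chris", "age": 33}
print(written == as_string)  # True
```

Keyword arguments such as indent or sort_keys can be passed to either method in the same way.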

1. Auto-Indentation for Pretty Output

See that string output by the dumps() method? It is not ideal for reading. If we have a really large JSON document, everything will be output in a single line.

To output the JSON string in a pretty format, we can add the indent parameter. It takes a non-negative integer (the number of spaces per nesting level) or a string such as '\t' that is used to indent each level.

json.dumps(my_dict, indent=2)
# OR
json.dumps(my_dict, indent=4)
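As a small illustration of what indent produces, the sketch below prints the two-space version and also builds a tab-indented one. The variable names are only for this example.

```python
import json

my_dict = {'name': 'Chris', 'age': 33}

# An integer indent means that many spaces per nesting level.
pretty = json.dumps(my_dict, indent=2)
print(pretty)
# {
#   "name": "Chris",
#   "age": 33
# }

# A string indent is repeated once per nesting level, here a tab.
tabbed = json.dumps(my_dict, indent='\t')
print(tabbed)
```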

2. Customised Separators

JSON follows a pretty strict format. For example, the items at the same level must be separated by a comma, and a colon must be used between a key and its value. By default, when indent is not set, the item separator and the key separator are ', ' and ': ' (a comma or a colon, each followed by a space).

However, if we want to output a compact JSON string, we can change these separators with the separators parameter. We must pass both separators in a tuple, with the item separator first.

json.dumps(my_dict, separators=(',', ':'))
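The effect is easiest to see side by side. This is a minimal sketch comparing the default output with the compact one for the same dictionary.

```python
import json

my_dict = {'name': 'Chris', 'age': 33}

# Default separators add a space after each comma and colon.
default_text = json.dumps(my_dict)
print(default_text)  # {"name": "Chris", "age": 33}

# Compact separators drop that whitespace.
compact_text = json.dumps(my_dict, separators=(',', ':'))
print(compact_text)  # {"name":"Chris","age":33}

# Both strings decode to the same dictionary.
print(json.loads(compact_text) == json.loads(default_text))  # True
```

The saving is small for one object but can add up when many documents are sent over a network or stored.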

If we don’t have to use the JSON string as JSON, we can also modify the separators to whatever we want. For example, we can let it become the PHP style as follows.

json.dumps(
    my_dict,
    separators=('', ' => '),
    indent=2
)

3. Sort Keys

JSON usually doesn’t care about the order of the items, so json.dumps() does not reorder anything by default. When we dump a Python dictionary to a JSON string, the order of the items is kept as-is.

json.dumps({
    'c': 1,
    'b': 2,
    'a': 3
})

However, if we do want to sort the converted JSON string by the item keys, we can set the sort_keys parameter to True.

json.dumps({
    'c': 1,
    'b': 2,
    'a': 3
}, sort_keys=True)

The keys will be sorted alphabetically.
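One practical use of sort_keys, sketched below, is getting identical output for dictionaries that hold the same items in a different insertion order. The two sample dictionaries are made up for this illustration.

```python
import json

first = {'c': 1, 'b': 2, 'a': 3}
second = {'a': 3, 'c': 1, 'b': 2}  # same items, different insertion order

# Without sorting, the output follows insertion order.
print(json.dumps(first))   # {"c": 1, "b": 2, "a": 3}
print(json.dumps(second))  # {"a": 3, "c": 1, "b": 2}

# With sort_keys=True, both produce the same string.
sorted_first = json.dumps(first, sort_keys=True)
sorted_second = json.dumps(second, sort_keys=True)
print(sorted_first)                   # {"a": 3, "b": 2, "c": 1}
print(sorted_first == sorted_second)  # True
```

This can be handy when JSON strings are compared, hashed, or stored under version control.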

4. Skip Non-Basic Key Types

The json module only accepts several types of objects as item keys, which are str, int, float, bool and None. These types are called basic types, and keys of these types are converted to strings in the output. If we try to convert a dictionary with a non-basic type of key, a TypeError will be raised.

json.dumps({
    'name': 'Chris',
    (1, 2): 'I am a tuple'
})
That makes sense because JSON doesn’t support a collection type as keys. However, if we want to skip these keys, which don’t make sense anyway, we can set skipkeys to True to silently drop these items.

json.dumps({
    'name': 'Chris',
    (1, 2): 'I am a tuple'
}, skipkeys=True)
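The following sketch puts the two behaviours next to each other, and also shows how keys of basic types other than str end up as strings. The variable names are chosen just for this example.

```python
import json

data = {'name': 'Chris', (1, 2): 'I am a tuple'}

# By default, a tuple key raises TypeError.
try:
    json.dumps(data)
    raised = False
except TypeError:
    raised = True
print(raised)  # True

# With skipkeys=True the offending item is dropped without an error.
skipped = json.dumps(data, skipkeys=True)
print(skipped)  # {"name": "Chris"}

# Basic non-string keys are allowed and are converted to strings.
converted = json.dumps({1: 'one', None: 'nothing'})
print(converted)  # {"1": "one", "null": "nothing"}
```

Because the item disappears silently, skipkeys=True is worth using only when losing those entries is acceptable.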

5. Non-ASCII Characters

Non-ASCII characters are not guaranteed to display correctly on every platform, and they may also cause trouble when the JSON string is transferred. Therefore, by default, Python escapes them as \uXXXX sequences when converting to a JSON string, as follows.

json.dumps({
    'name': 'Chris',
    'desc': 'There is a special char -> ¢'
})

However, if we don’t want to escape these special characters, we can set ensure_ascii to False.

json.dumps({
    'name': 'Chris',
    'desc': 'There is a special char -> ¢'
}, ensure_ascii=False)
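Here is a short sketch of the two outputs, using a smaller dictionary than the one above so the result fits on one line.

```python
import json

data = {'desc': '¢'}

# Default: the cent sign (U+00A2) is escaped to pure ASCII.
escaped = json.dumps(data)
print(escaped)  # {"desc": "\u00a2"}

# ensure_ascii=False keeps the character as it is.
raw = json.dumps(data, ensure_ascii=False)
print(raw)  # {"desc": "¢"}

# Both forms decode back to the same dictionary.
print(json.loads(escaped) == json.loads(raw))  # True
```

When the unescaped string is written to a file, it is usually safer to open the file with an explicit encoding such as encoding='utf-8'.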

6. Circular Check

In Python, it is possible to define a dictionary with a circular reference. For example, let’s define a dictionary with a key called “dictionary”, and the value is None for now.

my_dict = {
    'dictionary': None
}

Then, let’s assign the dictionary itself as the value of the key “dictionary”.

my_dict['dictionary'] = my_dict

Now, if we try to print this dictionary, Python displays {...} in place of the nested value because there is a circular reference.

If we try to dump such a dictionary to a JSON string, the circular reference will be detected and a ValueError will be raised.

However, if we don’t want to detect the circular reference and just let it go, we can set the parameter check_circular to False.

json.dumps(my_dict, check_circular=False)

The only difference is that, with the check disabled, Python will really try to dump the circularly referenced dictionary level by level until the recursion limit is exceeded, which results in a RecursionError (or worse). There is perhaps no benefit to doing so. Even if you want to use this feature to achieve something else (I couldn’t find an example), there must be a better way than this.
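The default behaviour can be sketched as follows. The example deliberately leaves check_circular at its default, since disabling it on a circular structure only ends in a failure.

```python
import json

my_dict = {'dictionary': None}
my_dict['dictionary'] = my_dict  # the dictionary now contains itself

# Python abbreviates the nested self-reference when printing.
print(my_dict)  # {'dictionary': {...}}

# With the default check_circular=True, dumps() raises ValueError.
error_message = None
try:
    json.dumps(my_dict)
except ValueError as exc:
    error_message = str(exc)

print(error_message)
```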

7. Allow NaN (Not a Number)

By default, when converting a dictionary that contains a special float value such as nan, inf or -inf, json.dumps() writes it out as NaN, Infinity or -Infinity rather than raising an error.

import numpy as np

json.dumps({
    'my_number': np.nan
})

This will cause some problems later on if we try to use the “JSON string” for other purposes because it is not a valid JSON string anymore. NaN is not a valid JSON value type.

If we want to avoid this, or at least let the problem reveal itself earlier, we can set the parameter allow_nan to False. When there is a NaN, a ValueError will be raised.

json.dumps({
    'my_number': np.nan
}, allow_nan=False)
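The sketch below shows both modes without needing NumPy. It uses float('nan') and friends from plain Python, which is a substitution on my part; np.nan is an ordinary Python float, so it should behave the same way.

```python
import json

values = [float('nan'), float('inf'), float('-inf')]

# Default (allow_nan=True): JavaScript-style literals, not valid JSON.
lenient = json.dumps(values)
print(lenient)  # [NaN, Infinity, -Infinity]

# allow_nan=False: the problem surfaces immediately as a ValueError.
strict_error = None
try:
    json.dumps(values, allow_nan=False)
except ValueError as exc:
    strict_error = exc

print(type(strict_error).__name__)  # ValueError
```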

8. Customised JSON Encoder

Last but not least, let’s have a look at how to customise the JSON encoder. Rather than letting everything stay at the default, we can easily tweak the behaviour of the dumps() method.

For example, suppose we have the following dictionary. It has a datetime object as the value of one item.

from datetime import datetime

my_dict = {
    'alarm_name': 'get up',
    'alarm_time': datetime(2021, 12, 3, 7)
}

In this case, if we try to dump the dictionary into JSON, a TypeError will be raised.

What if we want to serialise datetime objects when dumping the dictionary? We can create a subclass that inherits from json.JSONEncoder, and then implement the default method.

class DateTimeEncoder(json.JSONEncoder):
    def default(self, obj):
        # Convert datetime objects to a formatted string.
        if isinstance(obj, datetime):
            return obj.strftime('%Y-%m-%d %H:%M:%S')
        # Anything else falls back to the base class, which raises TypeError.
        return json.JSONEncoder.default(self, obj)

In the above code, we check if the object is a datetime object. If so, we convert the datetime into a string and then return it. Otherwise, we fall back to the JSONEncoder.default() method, which raises a TypeError for unsupported types.

Once we have this customised encoder class, we can pass it to the cls parameter of the dumps() method to change its behaviour.

json.dumps(my_dict, cls=DateTimeEncoder, indent=2)
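For a one-off conversion, dumps() also accepts a default parameter that takes a plain function instead of an encoder class. The following is a minimal sketch of that alternative; the function name encode_datetime is hypothetical and chosen only for this example.

```python
import json
from datetime import datetime


def encode_datetime(obj):
    """Hypothetical helper: called only for objects json cannot serialise."""
    if isinstance(obj, datetime):
        return obj.strftime('%Y-%m-%d %H:%M:%S')
    # Mirror the standard behaviour for every other unsupported type.
    raise TypeError(f'Object of type {type(obj).__name__} is not JSON serializable')


my_dict = {
    'alarm_name': 'get up',
    'alarm_time': datetime(2021, 12, 3, 7)
}

result = json.dumps(my_dict, default=encode_datetime)
print(result)  # {"alarm_name": "get up", "alarm_time": "2021-12-03 07:00:00"}
```

A subclass is probably the better fit when the same conversion is reused in many places, while a function passed to default keeps a single call short.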

Summary


In this article, I have introduced only one method: json.dumps() in Python. We use built-in methods like this one a lot, but we may not know every aspect of them. It is well worth investigating the methods we are already familiar with rather than hunting for ones that are rarely used, because the former will probably give us more value for the time we spend.
