Sure, it’s only 320 records, which isn’t really that much information to look at and traverse, but if we were to load this data into Kibana, we’d be able to explore this information much more intuitively and quickly using its tools.

So let’s get these CSV records into our Elasticsearch instance. We’ll be using Python for this, but any scripting language is suitable for this use case.

Our Python script will be a lot easier to write if we use the client provided by Elastic, which can be installed with pip.

$ pip3 install elasticsearch

Here’s what the Python script will look like:

from elasticsearch import helpers, Elasticsearch
import csv

es = Elasticsearch()

with open('./2010_Census_Populations_by_Zip_Code.csv') as f:
    index_name = 'census_data_records'
    doctype = 'census_record'
    reader = csv.reader(f)
    headers = []
    index = 0

    # Start from a clean index on each run
    es.indices.delete(index=index_name, ignore=[400, 404])
    es.indices.create(index=index_name, ignore=400)

    # Map every attribute to a numeric type before indexing any records
    es.indices.put_mapping(
        index=index_name,
        doc_type=doctype,
        ignore=400,
        body={
            doctype: {
                "properties": {
                    "Zip Code": {"type": "float"},
                    "Total Population": {"type": "float"},
                    "Median Age": {"type": "float"},
                    "Total Males": {"type": "float"},
                    "Total Females": {"type": "float"},
                    "Total Households": {"type": "float"},
                    "Average Household Size": {"type": "float"}
                }
            }
        }
    )

    for row in reader:
        try:
            if index == 0:
                # The first row holds the column names
                headers = row
            else:
                # Build a document keyed by the CSV headers
                obj = {}
                for i, val in enumerate(row):
                    obj[headers[i]] = float(val)
                # Put the document into Elasticsearch
                es.index(index=index_name, doc_type=doctype, body=obj)
                print(obj)
        except Exception as e:
            print('error: ' + str(e) + ' in row ' + str(index))
        index = index + 1
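Indexing one document per request is fine for 320 records, but for larger files the client's helpers.bulk function can send many documents in a single request. A minimal sketch, assuming the same index and document type (the two documents below use made-up values, not rows from the actual file):

```python
# Each action bundles one document with its destination index and type;
# helpers.bulk (from the elasticsearch package) would send them all at once.
actions = [
    {
        "_index": "census_data_records",
        "_type": "census_record",
        "_source": {"Zip Code": 90001.0, "Total Population": 57110.0},
    },
    {
        "_index": "census_data_records",
        "_type": "census_record",
        "_source": {"Zip Code": 90002.0, "Total Population": 51223.0},
    },
]

# With a client connected to a running instance, the whole list goes in one call:
# from elasticsearch import helpers, Elasticsearch
# helpers.bulk(Elasticsearch(), actions)
```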

The CSV file should be in the same directory as the script. If you’re using a different CSV data source, you should also change the script to match the name of that file.

Let’s explain what the script does.

In Elasticsearch, an index can be thought of as the equivalent of a database in a traditional relational database system. Documents, the JSON objects that hold the data, are indexed, meaning they are stored and made searchable under that index. The doc_type is not strictly required, since it defaults to _doc; defining it, however, keeps our schema easier to maintain.

The lines of code above take all the data from the CSV file and import it into Elasticsearch record by record. The headers of the CSV file become the field names of each document, and every value is converted to a float (a general catch-all for safety and simplicity, since most of the data is numeric anyway, even where integer precision would suffice).
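For example, given the census headers, a single CSV row turns into a document like this (the values below are illustrative, not taken from the actual file):

```python
# Illustrative header row and data row (made-up values)
headers = ["Zip Code", "Total Population", "Median Age"]
row = ["90001", "57110", "26.6"]

# Pair each value with its header and convert it to a float, as the script does
doc = {h: float(v) for h, v in zip(headers, row)}
print(doc)  # {'Zip Code': 90001.0, 'Total Population': 57110.0, 'Median Age': 26.6}
```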

Before we save the CSV records, though, we have to map the Elasticsearch index (with es.indices.put_mapping) so that each data attribute is stored with its desired type. Otherwise the data wouldn't be stored as numerical values within Elasticsearch, and we wouldn't be able to perform calculations and statistical analysis on it in Kibana.
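Since every column gets the identical float definition, the body passed to the mapping call doesn't have to be written out by hand; it can be generated from the list of column names. A sketch using the columns from this dataset:

```python
# Column names from the census CSV
columns = ["Zip Code", "Total Population", "Median Age", "Total Males",
           "Total Females", "Total Households", "Average Household Size"]

# Every attribute is mapped to the same numeric type
mapping_body = {
    "census_record": {
        "properties": {col: {"type": "float"} for col in columns}
    }
}
```

The resulting dict is equivalent to the literal body in the script, so it can be passed straight to es.indices.put_mapping.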