Ethereum is great for data enthusiasts. As a blockchain, it provides us with lots of well-structured data, making virtually everything that happens on it available to analysis.

Blockchain ETL has long been the leading open-source project for parsing Ethereum data (600+ stars on GitHub).

My fellow D5 DAO member Evgeny Medvedev also worked with Google in 2018 to make Ethereum datasets available via BigQuery.

In this post, I’ll show you how you can easily contribute smart-contract parsers to Blockchain ETL in 3 simple steps. You don’t even need to write any code!

At the end of this process, you’ll have tables for the smart contracts that you contributed available in BigQuery here. As an example, a query for all Compound cBAT mints would be as easy as:

SELECT * FROM `blockchain-etl.ethereum_compound.cBAT_event_Mint`

The whole process can be broken down into three steps:

1. Use the Contract Parser tool to generate table definition files
2. Create a pull request in ethereum-etl-airflow
3. Request a review and wait for approval

Now, let’s go through each step in detail.

1. Use the Contract Parser tool

(1) First, head over to this Contract Parser tool and paste in a smart contract address of your choice. It’ll look something like this:

Hit submit, and some magic happens under the hood. You’ll see a new box:

In this case, we found 4 events in this Chainlink Oracle contract.

(UPDATE: Added instructions for proxy contracts in this post.)

(2) Now, before downloading the table definitions, type in the name of your dataset. Some example datasets we already have are:

kyber (for Kyber Network)

zeroex (for 0x Project)

compound (for Compound)

ens (for Ethereum Name Service)

Stick to lowercase, underscore-separated names to keep things consistent.

In this case, let’s say we name our new dataset chainlink:

Getting ready to parse some Chainlink data!

(3) Click “Download Table Definitions”. This will download one JSON file per event in the smart contract. Each file contains a table definition that lets us generate the parser for that event in BigQuery.
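For reference, the downloaded files follow the table definition format used by ethereum-etl-airflow, which pairs the event’s parser settings with its BigQuery table settings. A trimmed sketch might look roughly like this (the event name, address, and schema fields here are illustrative, not copied from a real Chainlink definition):

```json
{
  "parser": {
    "type": "log",
    "contract_address": "0x...",
    "abi": { "comment": "the event's ABI fragment goes here" },
    "field_mapping": {}
  },
  "table": {
    "dataset_name": "chainlink",
    "table_name": "Oracle_event_OracleRequest",
    "table_description": "",
    "schema": [
      { "name": "requester", "type": "STRING", "description": "" }
    ]
  }
}
```

You normally won’t need to edit anything except, optionally, the table_description (see step 2 below).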

2. Create a pull request in ethereum-etl-airflow

(1) Assuming you have Git installed, fork the ethereum-etl-airflow repository, and then clone it to your local machine:
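The clone step looks something like this (a template, not runnable as-is: replace the placeholder with your own GitHub username after forking; the upstream repo lives under the blockchain-etl organization):

```shell
# Fork https://github.com/blockchain-etl/ethereum-etl-airflow on GitHub first,
# then clone your fork (replace <your-username> with your GitHub username):
git clone https://github.com/<your-username>/ethereum-etl-airflow.git
cd ethereum-etl-airflow
```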

(2) Create a new branch, e.g. in our case:

git checkout -b feature/chainlink-events

(3) Now paste the table definition files you downloaded above into a new folder named after your dataset (in our case: “chainlink”).

(4) Move this folder into the following path in the repository above:

dags/resources/stages/parse/table_definitions/

You’ll see our existing datasets there (airswap, compound, ens, etc).
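Steps (3) and (4) can be sketched as shell commands; the download location below is an assumption, so adjust it to wherever your browser saved the JSON files:

```shell
# Run from the root of your ethereum-etl-airflow checkout.
DATASET=chainlink
DEST="dags/resources/stages/parse/table_definitions/$DATASET"
mkdir -p "$DEST"

# Copy in the table definition files the Contract Parser downloaded.
# The source path is an assumption; adjust it to your setup.
cp ~/Downloads/*_event_*.json "$DEST"/ 2>/dev/null || true

ls "$DEST"
```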

(Optional) It would be fantastic if you added a sentence to each file documenting what the event represents. The table_description field inside each JSON is the correct place for it.

(5) Git add, commit, and push your changes:

git add dags/resources/stages/parse/table_definitions/

git commit -m "Added Chainlink events"

git push --set-upstream origin feature/chainlink-events

Head to GitHub to complete the pull request.

3. Request a review and wait for approval

(1) Leave a comment on your PR saying that you used the Contract Parser:

“Used the Contract Parser: https://contract-parser.d5.ai”

(2) Request a review from Evgeny: medvedev1088 on GitHub

(3) Wait for approval and deployment!