Parens for Python - Sci SpaCy

NLP for scientific text

We are going to explore some more Python libraries through the use of libpython-clj.

This time, we are going to look at Sci SpaCy, a set of spaCy models for processing scientific and biomedical text.

deps.edn:

    {:deps {org.clojure/clojure {:mvn/version "1.10.1"}
            clj-python/libpython-clj {:mvn/version "1.36"}}}

Install the Python dependencies and model

    pip3 install spacy scispacy
    pip3 install https://s3-us-west-2.amazonaws.com/ai2-s2-scispacy/releases/v0.2.4/en_core_sci_sm-0.2.4.tar.gz

We are going to be following the tutorial from https://allenai.github.io/scispacy/

Load up the model and analyze

The first thing we need to do is set up the namespace and load the model.

    (ns gigasquid.sci-spacy
      (:require [libpython-clj.require :refer [require-python]]
                [libpython-clj.python :as py :refer [py. py.. py.-]]))

    (require-python '[spacy :as spacy])
    (require-python '[scispacy :as scispacy])

    (def nlp (spacy/load "en_core_sci_sm"))

Now, we are ready to analyze some text:

    (def text "Myeloid derived suppressor cells (MDSC) are immature myeloid cells with immunosuppressive activity. They accumulate in tumor-bearing mice and humans with different types of cancer, including hepatocellular carcinoma (HCC).")

    (def doc (nlp text))

Let's find all the entities.

    (map (fn [ent] (py.- ent text)) (py.- doc ents))
    ;; => ("Myeloid" "suppressor cells" "MDSC" "immature" "myeloid cells"
    ;;     "immunosuppressive activity" "accumulate" "tumor-bearing mice"
    ;;     "humans" "cancer" "hepatocellular carcinoma" "HCC")
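Each entity is a spaCy span, so it carries more than its text. As a rough sketch, assuming the same model and the standard spaCy span attributes start_char, end_char and label_, the entities can be copied into plain Clojure maps. The namespace name and the entity->map helper are made up for this illustration and are not part of the tutorial.

```clojure
(ns gigasquid.sci-spacy-entities ; hypothetical namespace for this sketch
  (:require [libpython-clj.require :refer [require-python]]
            [libpython-clj.python :refer [py.-]]))

(require-python '[spacy :as spacy])
(require-python '[scispacy :as scispacy])

(def nlp (spacy/load "en_core_sci_sm"))

(def text "Myeloid derived suppressor cells (MDSC) are immature myeloid cells with immunosuppressive activity.")

(defn entity->map
  "Hypothetical helper: copies a few attributes of a spaCy span into a Clojure map."
  [ent]
  {:text  (py.- ent text)        ; the entity's surface text
   :label (py.- ent label_)      ; the entity label assigned by the model
   :start (py.- ent start_char)  ; character offset where the entity starts
   :end   (py.- ent end_char)})  ; character offset just past the entity

(def entities (mapv entity->map (py.- (nlp text) ents)))
```

With the offsets in hand, (subs text (:start e) (:end e)) should give back the :text of an entity e, which is handy for highlighting entities in the original string.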

We can do the same with the sentences.

    (map (fn [sent] (py.- sent text)) (py.- doc sents))
    ;; => ("Myeloid derived suppressor cells (MDSC) are immature myeloid cells with immunosuppressive activity."
    ;;     "They accumulate in tumor-bearing mice and humans with different types of cancer, including hepatocellular carcinoma (HCC).")
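A sentence is itself a sequence of tokens, so we can drill down one more level. The following is only a sketch: it assumes that a spaCy sentence span can be iterated from Clojure the same way the entities and sentences above are, and it reads the standard spaCy token attributes tag_, dep_ and head. The namespace name is hypothetical.

```clojure
(ns gigasquid.sci-spacy-tokens ; hypothetical namespace for this sketch
  (:require [libpython-clj.require :refer [require-python]]
            [libpython-clj.python :refer [py.-]]))

(require-python '[spacy :as spacy])
(require-python '[scispacy :as scispacy])

(def nlp (spacy/load "en_core_sci_sm"))

(def text "Myeloid derived suppressor cells (MDSC) are immature myeloid cells with immunosuppressive activity.")

;; Take the first sentence of the parsed document.
(def first-sentence (first (py.- (nlp text) sents)))

;; Describe each token by its text, part-of-speech tag, dependency
;; relation, and the text of the token it depends on.
(def tokens
  (mapv (fn [token]
          {:text (py.- token text)
           :tag  (py.- token tag_)
           :dep  (py.- token dep_)
           :head (py.- (py.- token head) text)})
        first-sentence))
```

The :dep and :head values are the same information that the dependency graph draws as labeled arcs.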

We can even graph things! Here we render the dependency parse of the first sentence as an SVG.

    (require-python '[spacy.displacy :as displacy])

    (spit "results/my-pic.svg"
          (displacy/render (first (py.- doc sents)) :style "dep"))
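displaCy can also highlight the entities instead of drawing the parse. The sketch below assumes the "ent" style and the page and jupyter keyword arguments of displacy.render, passed the same way as :style above. The output file name and namespace are made up for this example. Since spit does not create missing directories, the sketch calls make-parents first.

```clojure
(ns gigasquid.sci-spacy-displacy ; hypothetical namespace for this sketch
  (:require [clojure.java.io :as io]
            [libpython-clj.require :refer [require-python]]))

(require-python '[spacy :as spacy])
(require-python '[scispacy :as scispacy])
(require-python '[spacy.displacy :as displacy])

(def nlp (spacy/load "en_core_sci_sm"))

(def doc (nlp "Myeloid derived suppressor cells (MDSC) are immature myeloid cells with immunosuppressive activity."))

;; Hypothetical output location for the rendered entities.
(def out-file "results/my-entities.html")

;; Create the results directory if it is not there yet.
(io/make-parents out-file)

;; :page true asks for a complete HTML page; :jupyter false asks displaCy
;; to return the markup as a string rather than display it in a notebook.
(spit out-file
      (displacy/render doc :style "ent" :page true :jupyter false))
```

Opening the resulting file in a browser should show the sentence with each recognized entity highlighted.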

Want more examples? Check them out here: https://github.com/gigasquid/libpython-clj-examples