From 556f2d0c0271a7595ec174c3bbe5f15638368069 Mon Sep 17 00:00:00 2001
From: Fiete Ostkamp <f.ostkamp@web.de>
Date: Sun, 30 May 2021 11:20:53 +0200
Subject: [PATCH 1/2] add index_to_name.json, update torch-model-archiver
 command to include the file

---
 README.md          | 15 ++++++++-------
 index_to_name.json | 28 ++++++++++++++++++++++++++++
 2 files changed, 36 insertions(+), 7 deletions(-)
 create mode 100644 index_to_name.json

diff --git a/README.md b/README.md
index b69aa12..946d02e 100644
--- a/README.md
+++ b/README.md
@@ -113,16 +113,17 @@ data = _service.preprocess(data)
 return data
     except Exception as e:
         raise e
-
-```
-TorcheServe uses a format called MAR (Model Archive). We can convert our PyTorch model to a .mar file using this command:
 ```
-torch-model-archiver --model-name "bert" --version 1.0 --serialized-file ./bert_model/pytorch_model.bin --extra-files "./bert_model/config.json,./bert_model/vocab.txt" --handler "./handler.py" 
+Create a new model directory:
+``` bash
+mkdir model_store
 ```
-Move the .mar file into a new directory: 
-```
-mkdir model_store && mv bert.mar model_store 
+TorcheServe uses a format called `MAR` (Model Archive). We can convert our PyTorch model to a `.mar` file using this command:
+``` bash
+torch-model-archiver --model-name "bert" --version 1.0 --serialized-file ./bert_model/pytorch_model.bin --extra-files "./bert_model/config.json,./bert_model/vocab.txt,./bert_model/index_to_name.json" --handler "./handler.py" --export-path "model_store/"
 ```
+The resulting mar file will be stored in the `model_store` directory we just created.
+
 Finally, we can start TorchServe using the command: 
 ```
 torchserve --start --model-store model_store --models bert=bert.mar
diff --git a/index_to_name.json b/index_to_name.json
new file mode 100644
index 0000000..089ef20
--- /dev/null
+++ b/index_to_name.json
@@ -0,0 +1,28 @@
+{
+    "0":"Business Ethics",
+    "1":"Data Security",
+    "2":"Access and Affordability",
+    "3":"Business Model Resilience",
+    "4":"Competitive Behavior",
+    "5":"Critical Incident Risk Management",
+    "6":"Customer Welfare",
+    "7":"Director Removal",
+    "8":"Employee Engagement Inclusion And Diversity",
+    "9":"Employee Health And Safety",
+    "10":"Human Rights And Community Relations",
+    "11":"Labor Practices",
+    "12":"Management Of Legal And Regulatory Framework",
+    "13":"Physical Impacts Of Climate Change",
+    "14":"Product Quality And Safety",
+    "15":"Product Design And Lifecycle Management",
+    "16":"Selling Practices And Product Labeling",
+    "17":"Supply Chain Management",
+    "18":"Systemic Risk Management",
+    "19":"Waste And Hazardous Materials Management",
+    "20":"Water And Wastewater Management",
+    "21":"Air Quality",
+    "22":"Customer Privacy",
+    "23":"Ecological Impacts",
+    "24":"Energy Management",
+    "25":"GHG Emissions"
+}
\ No newline at end of file

From 2f786a3074dc69a42af78ad2ad1f1cf2f10ad50d Mon Sep 17 00:00:00 2001
From: Fiete Ostkamp <f.ostkamp@web.de>
Date: Sun, 30 May 2021 11:22:42 +0200
Subject: [PATCH 2/2] structure the README with headings, highlight file names

---
 README.md | 73 ++++++++++++++++++++++---------------------------------
 1 file changed, 29 insertions(+), 44 deletions(-)

diff --git a/README.md b/README.md
index 946d02e..62d9b8f 100644
--- a/README.md
+++ b/README.md
@@ -5,32 +5,26 @@ Read more about this pre-trained model [here.](https://towardsdatascience.com/nl
 
 **In collaboration with [Charan Pothireddi](https://www.linkedin.com/in/sree-charan-pothireddi-6a0a3587/) and [Parabole.ai](https://www.linkedin.com/in/sree-charan-pothireddi-6a0a3587/)**
 
+## Prerequisites
 The further pre-trained ESG-BERT model can be found [here](https://drive.google.com/drive/folders/1yfNpMvByz3fJMsOqir3SerS6PwsRS2rt?usp=sharing) at this GitHub repository. It is a PyTorch model but it can be converted into a Tensorflow model. They can be fine-tuned using either framework. I found the PyTorch framework to be a lot cleaner, and easier to replicate with other models. However, serving the final fine-tuned model is a lot easier on TensorFlow, than on PyTorch. 
 
-You can download the ESG-BERT model (named pytorch_model.bin) along with config.json and vovab.txt fles here. BERT base model was further pre-trained on Sustainable Investing text corpus, resulting in a domain specific model. You need the all of those 3 files for fine-tuning.  
+You can download the ESG-BERT model (named `pytorch_model.bin`) along with `config.json` and `vocab.txt` files here. The BERT base model was further pre-trained on Sustainable Investing text corpus, resulting in a domain specific model. You need the all of those 3 files for fine-tuning.  
 
-For fine-tuning the model, you can use this command to load it into PyTorch. 
-```
-model = BertForSequenceClassification.from_pretrained(
-    'path/to/dir/containing/ESG-BERT', 
-    num_labels = num, #number of classifications
-   output_attentions = False, # Whether the model returns attentions weights.
-    output_hidden_states = False, # Whether the model returns all hidden-states.
-)
-model.to(device)
-
-```
 The fine-tuned model for text classification is also available [here](https://drive.google.com/drive/folders/1Qz4HP3xkjLfJ6DGCFNeJ7GmcPq65_HVe?usp=sharing). It can be used directly to make predictions using just a few steps. 
-First, download the fine-tuned pytorch_model.bin, config.json, and vocab.txt into your local directory. Make sure to place all of them into the same directory, mine is called "bert_model". 
+First, download the fine-tuned `pytorch_model.bin`, `config.json`, and `vocab.txt` into your local directory. Make sure to place all of them into the same directory, mine is called `bert_model`.
+
+### Install dependencies
 JDK 11 is needed to serve the model. Go ahead and install it from the Oracle downloads page. Now we are ready to set up TorcheServe.
 TorchServe is a model serving architecture for PyTorch models, go ahead and install that using pip. You can also use conda for the installation. We also need pytorch and transformers installed. 
-```
+``` bash
 pip install torchserve torch-model-archiver
 pip install torchvision 
 pip install transformers
 ```
-Next up, we'll set up the handler script. It is a basic handler for text classification that can be improved upon. Save this script as "handler.py" in your directory. [1]
-```
+
+### Set up the handler script
+Next up, we'll set up the handler script. It is a basic handler for text classification that can be improved upon. Save this script as `handler.py` in your directory. [1]
+``` python
 from abc import ABC
 import json
 import logging
@@ -113,7 +107,9 @@ data = _service.preprocess(data)
 return data
     except Exception as e:
         raise e
+
 ```
+## Creating a torchserve model archive
 Create a new model directory:
 ``` bash
 mkdir model_store
@@ -124,44 +120,33 @@ torch-model-archiver --model-name "bert" --version 1.0 --serialized-file ./bert_
 ```
 The resulting mar file will be stored in the `model_store` directory we just created.
 
+## Serve the model
 Finally, we can start TorchServe using the command: 
 ```
 torchserve --start --model-store model_store --models bert=bert.mar
 ```
+
+## Test the model
 We can now query the model from another terminal window using the Inference API. We pass a text file containing text that the model will try to classify. 
 
 ```
 curl -X POST http://127.0.0.1:8080/predictions/bert -T predict.txt
 ```
-This returns a label number which correlates to a textual label. This is stored in the label_dict.txt dictionary file. 
-```
-__label__Business_Ethics :  0
-__label__Data_Security :  1
-__label__Access_And_Affordability :  2
-__label__Business_Model_Resilience :  3
-__label__Competitive_Behavior :  4
-__label__Critical_Incident_Risk_Management :  5
-__label__Customer_Welfare :  6
-__label__Director_Removal :  7
-__label__Employee_Engagement_Inclusion_And_Diversity :  8
-__label__Employee_Health_And_Safety :  9
-__label__Human_Rights_And_Community_Relations :  10
-__label__Labor_Practices :  11
-__label__Management_Of_Legal_And_Regulatory_Framework :  12
-__label__Physical_Impacts_Of_Climate_Change :  13
-__label__Product_Quality_And_Safety :  14
-__label__Product_Design_And_Lifecycle_Management :  15
-__label__Selling_Practices_And_Product_Labeling :  16
-__label__Supply_Chain_Management :  17
-__label__Systemic_Risk_Management :  18
-__label__Waste_And_Hazardous_Materials_Management :  19
-__label__Water_And_Wastewater_Management :  20
-__label__Air_Quality :  21
-__label__Customer_Privacy :  22
-__label__Ecological_Impacts :  23
-__label__Energy_Management :  24
-__label__GHG_Emissions :  25
+This returns a textual label, defined in the `index_to_name.json` file. 
+
+## Fine-tuning the model yourself
+For fine-tuning the model, you can use this command to load it into PyTorch. 
+``` python
+model = BertForSequenceClassification.from_pretrained(
+    'path/to/dir/containing/ESG-BERT', 
+    num_labels = num, #number of classifications
+   output_attentions = False, # Whether the model returns attentions weights.
+    output_hidden_states = False, # Whether the model returns all hidden-states.
+)
+model.to(device)
 ```
+
+
 References:
 [1] - ---