From 406982260f3b131f2a94aa3a52bb8234a1974311 Mon Sep 17 00:00:00 2001
From: Lysandre
Date: Wed, 13 Jan 2021 14:18:35 +0000
Subject: [PATCH] Update dimensions

---
 README.md | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/README.md b/README.md
index 61a7db6..05f22c5 100644
--- a/README.md
+++ b/README.md
@@ -42,6 +42,13 @@
 This way, the model learns an inner representation of the English language that is useful for downstream tasks: if you have a dataset of labeled sentences for instance, you can train a standard classifier using the features produced by the BERT model as inputs.
 
+This model has the following configuration:
+
+- 24-layer
+- 1024 hidden dimension
+- 16 attention heads
+- 336M parameters.
+
 ## Intended uses & limitations
 
 This model should be used as a question-answering model. You may use it in a question answering pipeline, or use it to output raw results given a query and a context. You may see other use cases in the [task summary](https://huggingface.co/transformers/task_summary.html#extractive-question-answering) of the transformers documentation.
 ## Training data
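As a sanity check on the dimensions this patch adds, the parameter count implied by the listed configuration can be worked out directly. The sketch below is a rough back-of-the-envelope calculation, assuming standard BERT defaults that do not appear in the patch itself (30,522-token vocabulary, 4096 intermediate size, 512 position embeddings, 2 token types, plus the `[CLS]` pooler):

```python
# Rough parameter count for a BERT-style encoder with the configuration
# listed in the card: 24 layers, hidden size 1024, 16 attention heads.
# Vocabulary, intermediate, and position sizes are standard BERT defaults,
# assumed here since the patch does not state them.

def bert_param_count(num_layers=24, hidden=1024, intermediate=4096,
                     vocab=30522, max_pos=512, type_vocab=2):
    # Embeddings: word + position + token-type tables, plus one LayerNorm
    embeddings = (vocab + max_pos + type_vocab) * hidden + 2 * hidden

    # Per encoder layer:
    attn_qkv = 3 * (hidden * hidden + hidden)      # Q, K, V projections
    attn_out = hidden * hidden + hidden            # attention output dense
    ffn_in = hidden * intermediate + intermediate  # intermediate dense
    ffn_out = intermediate * hidden + hidden       # output dense
    layer_norms = 2 * (2 * hidden)                 # two LayerNorms per layer
    per_layer = attn_qkv + attn_out + ffn_in + ffn_out + layer_norms

    pooler = hidden * hidden + hidden              # [CLS] pooler dense
    return embeddings + num_layers * per_layer + pooler

total = bert_param_count()
print(f"{total:,} parameters (~{total / 1e6:.0f}M)")  # ~335M
```

Under these assumptions the count comes out to roughly 335M, consistent with the 336M the card quotes (the exact figure depends on whether task heads are included). Note that the number of attention heads does not change the total: 16 heads of size 64 partition the same 1024-dimensional projections.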