Chat with the document. The response is returned in server-sent events mode by default; you can change this with the stream parameter.
Each request consumes one question quota, whether it targets a single document or a collection.
Response:
If the document is in pdf/doc/docx format, the structure is as follows:
{
  "data": {
    "answer": "answer to the question",
    "id": question_id,
    "source_info": [
      {
        # key: page number
        # value: rects
        "0": [[38.1063, 557.8058, 553.9003, 584.0043]]
      },
      {"1": [[38.0, 152.3994, 523.6151, 178.6392]], "upload_id": "xxxx"},
      {"0": [[38.0, 758.0623, 537.0082, 784.0]], "upload_id": "xxxx"},
      ...
    ]
  }
}
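As a rough illustration of how a client might walk this structure, the sketch below flattens source_info into (upload_id, page, rect) tuples. The function name iter_pdf_sources is hypothetical, the rect is treated as an opaque group of four numbers because the coordinate order is not stated here, and entries without upload_id are assumed to be valid (as in the first sample entry).

```python
from typing import Iterator, Optional, Tuple

Rect = Tuple[float, ...]


def iter_pdf_sources(source_info: list) -> Iterator[Tuple[Optional[str], int, Rect]]:
    """Yield (upload_id, page_number, rect) for every rect in source_info.

    Hypothetical helper: it assumes each entry maps page-number strings to
    lists of rects, with an optional "upload_id" key alongside them.
    """
    for entry in source_info:
        upload_id = entry.get("upload_id")  # may be absent, as in the first sample entry
        for key, rects in entry.items():
            if key == "upload_id":
                continue
            page = int(key)  # page numbers arrive as string keys
            for rect in rects:
                yield upload_id, page, tuple(rect)


if __name__ == "__main__":
    sample = [
        {"0": [[38.1063, 557.8058, 553.9003, 584.0043]]},
        {"1": [[38.0, 152.3994, 523.6151, 178.6392]], "upload_id": "xxxx"},
    ]
    for upload_id, page, rect in iter_pdf_sources(sample):
        print(upload_id, page, rect)
```

Flattening first makes it straightforward to group rects by upload_id or sort them by page before handing them to a viewer.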
If the document is in md/epub/txt/website format, the structure is as follows:
{
  "data": {
    "answer": "answer to the question",
    "id": question_id,
    "source_info": [
      {
        # key: element data-index
        # value: element xpath
        "198": [{"xpath": "/html/body/div/div[199]"}]
      },
      {
        "material": "", # selected text with HTML tags
        "indexes": [
          3,
          4,
          5,
          6
        ],
        "focusNode": "div[1]/p[5]/text()[1]",
        "upload_id": "903d971d-8250-47cc-a649-4b6ca35032dc",
        "anchorNode": "div[1]/p[2]/text()[1]",
        "focusOffset": "225",
        "anchorOffset": "2"
      },
      ...
    ]
  }
}
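The sample above shows two entry shapes in source_info: a data-index to xpath mapping, and a text-selection record with anchorNode/focusNode fields. A minimal sketch of separating the two might look like the following. The function name split_html_sources is hypothetical, and the rule used to tell the shapes apart (presence of both anchorNode and focusNode) is an assumption drawn only from this sample.

```python
from typing import Any, Dict, List, Tuple


def split_html_sources(
    source_info: List[Dict[str, Any]],
) -> Tuple[List[Tuple[int, str]], List[Dict[str, Any]]]:
    """Split source_info into element references and selection records.

    Assumption: an entry with both "anchorNode" and "focusNode" is a text
    selection; any other entry maps a data-index to a list of {"xpath": ...}.
    """
    elements: List[Tuple[int, str]] = []     # (data_index, xpath)
    selections: List[Dict[str, Any]] = []    # selection records, kept as-is
    for entry in source_info:
        if "anchorNode" in entry and "focusNode" in entry:
            selections.append(entry)
            continue
        for key, items in entry.items():
            if key == "upload_id":
                continue
            for item in items:
                elements.append((int(key), item["xpath"]))
    return elements, selections
```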
- answer: chunks of the answer, which may be in Markdown format to support rich text such as tables. For a detailed_citation answer, a chunk may contain a span tag, for example: autobiography captured the pre-Nazi Europe[<span data-index="0">1</span>]
  The tag attribute data-index is the index of the source in the source_info array, which is used to highlight the source of the preceding answer sentences in your PDF viewer. The highlighting method is the same as for source_info: just pass a slice of the source_info array as the parameter.
- id: ID of the question; you can use it to GET /questions/{question_id} later.
  Please note: you should store id in your database, because we don't have a GET /questions/list API to list all questions for now.
- source_info: returned only in the last chunk in server-sent events mode; it may be an empty list. If the last chunk doesn't contain the source_info attribute, an error occurred. Page numbers may not be ordered. You can use this information to highlight the source of the answer in the document with a specific upload_id in your PDF viewer, by calling the drawSources method of our DOCViewerSDK and converting source_info to the Source parameter.
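The field rules above can be sketched as a small stream consumer. This is only an illustration under stated assumptions, not the service's documented wire format: it assumes every SSE data: line carries one JSON object shaped like the structures shown earlier, and that answer chunks are deltas to be concatenated. The function name collect_answer and the returned cited_sources key are hypothetical.

```python
import json
import re
from typing import Iterable

# Matches the citation marker shown in the example, e.g. <span data-index="0">
CITATION_RE = re.compile(r'<span data-index="(\d+)">')


def collect_answer(sse_lines: Iterable[str]) -> dict:
    """Join answer chunks and resolve citations against source_info.

    Assumption: each "data:" line holds one JSON object with a "data" key.
    """
    answer_parts = []
    last = {}
    for raw in sse_lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and other SSE fields
        payload = json.loads(line[len("data:"):])
        last = payload.get("data", {})
        answer_parts.append(last.get("answer", ""))
    # Per the description above: a last chunk without source_info means an error.
    if "source_info" not in last:
        raise RuntimeError("last chunk has no source_info; treating as an error")
    source_info = last["source_info"]
    # Join before searching, since a span tag may be split across chunks.
    answer = "".join(answer_parts)
    cited = [int(i) for i in CITATION_RE.findall(answer)]
    return {
        "answer": answer,
        "id": last.get("id"),
        "source_info": source_info,
        # One single-element slice of source_info per citation marker.
        "cited_sources": [source_info[i:i + 1] for i in cited if i < len(source_info)],
    }
```

Each item of cited_sources is a slice of source_info, so it could be converted and passed to the viewer in the same way as the full array.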