國家寶藏松 - Backend Requirements

Edit History

Time	Author	Version
2017-08-09 19:44 – 19:44	Hsin Hsiao	r1154 – r1157
Diff: (16 lines unchanged) createdAt updatedAt + completedAt status: enum['dispatched', 'complete', 'incomplete', 'error'] (49 lines unchanged)
2017-08-09 18:34 – 18:41	Hsin Hsiao	r976 – r1153
Diff: (8 lines unchanged) If everything above checks out, check our flag with `unstarted` `started but incomplete` `complete` dispatch unstarted and incomplete ones to volunteer client + + NoSQL Table schema for TNT-Dispatch + uid: unique id for the dispatch + catalogId: unique id of the record in the TNT-Catalog + userId: unique id of the requesting user/volunteer + naId: associated naId of the record + createdAt + updatedAt + status: enum['dispatched', 'complete', 'incomplete', 'error'] + + Endpoints of the Dispatch Server + Request a dispatch: [Endpoint]/request + *Update a dispatch status: [Endpoint]/update (44 lines unchanged)
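The TNT-Dispatch schema listed in this revision could be modeled roughly as follows. This is only a sketch: the field names and status values come from the notes, while the class names `Dispatch` and `DispatchStatus`, the UUID-based `uid`, the ISO 8601 timestamps, and the `to_item` helper are assumptions made for illustration.

```python
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone
from enum import Enum
import uuid


class DispatchStatus(str, Enum):
    """The four status values listed in the notes."""
    DISPATCHED = "dispatched"
    COMPLETE = "complete"
    INCOMPLETE = "incomplete"
    ERROR = "error"


def _now() -> str:
    # Assumption: timestamps are stored as ISO 8601 strings in UTC.
    return datetime.now(timezone.utc).isoformat()


@dataclass
class Dispatch:
    catalogId: str  # unique id of the record in the TNT-Catalog
    userId: str     # unique id of the requesting user/volunteer
    naId: str       # associated naId of the record
    # Assumption: a random UUID is an acceptable unique id for the dispatch.
    uid: str = field(default_factory=lambda: str(uuid.uuid4()))
    createdAt: str = field(default_factory=_now)
    updatedAt: str = field(default_factory=_now)
    status: DispatchStatus = DispatchStatus.DISPATCHED

    def update_status(self, status: str) -> None:
        """Set a new status; raises ValueError for values outside the enum."""
        self.status = DispatchStatus(status)
        self.updatedAt = _now()

    def to_item(self) -> dict:
        """Return a plain dict that could be written to a NoSQL table."""
        item = asdict(self)
        item["status"] = self.status.value
        return item
```

A record created with `Dispatch(catalogId="c1", userId="u1", naId="12345")` starts in the `dispatched` state, and `update_status("complete")` moves it forward while refreshing `updatedAt`.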
2017-08-06 19:27	Hsin Hsiao	r975
Diff: (56 lines unchanged)
2017-01-10 02:44 – 03:54	Hsin Hsiao	r502 – r974
Diff: 國家寶藏松 - Backend Requirements + + Volunteer dispatch server + + Given a naId (every document has a unique naId) query its detail here https://catalog.archives.gov/api/v1/?naIds={naId} + check if `results.result[0].objects` exists; if yes, it has already been digitized and the image files are under `results.result[0].objects.object` array + if it hasn't been digitized, check if `results.result[0].description.series.fileUnit` exists and has a number > 1. If yes, that means there are multiple files under this naId and you need to query `https://catalog.archives.gov/api/v1/?description.fileUnit.parentSeries.naId={naId}` to get the sub files and their naIds. + Next, need to check `accessRestriction.status.termName` under `description` or `description.fileUnit` . If it's `restricted` then we don't want to dispatch this naId to volunteer. + If everything above checks out, check our flag with `unstarted` `started but incomplete` `complete` + *dispatch unstarted and incomplete ones to volunteer client + Document Index Server (43 lines unchanged)
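The checks described in this revision can be sketched as a pure function over an already-fetched catalog response. The JSON paths are the ones quoted in the notes; everything else is an assumption for illustration: the function name `triage`, the returned labels, treating `fileUnit` as a count or a list, and matching any `termName` that starts with "restricted" (case-insensitive). The real response layout should be verified against the catalog API before relying on this.

```python
from typing import Any


def _get(node: Any, *keys: str) -> Any:
    """Walk nested dicts, returning None as soon as a key is missing."""
    for key in keys:
        if not isinstance(node, dict) or key not in node:
            return None
        node = node[key]
    return node


def _count(value: Any) -> int:
    """Assumption: fileUnit is either a list of files or a count-like value."""
    if value is None:
        return 0
    if isinstance(value, (list, tuple)):
        return len(value)
    try:
        return int(value)
    except (TypeError, ValueError):
        return 1  # present, but not a number: treat as a single file


def triage(response: dict) -> str:
    """Classify a catalog response, following the order of checks in the notes."""
    records = _get(response, "results", "result") or []
    if not records:
        return "not_found"
    record = records[0]

    # 1. Already digitized: image files live under objects.object.
    if _get(record, "objects"):
        return "digitized"

    # 2. Several files under this naId: the sub files must be queried first.
    if _count(_get(record, "description", "series", "fileUnit")) > 1:
        return "expand_sub_files"

    # 3. Restricted records are never dispatched to volunteers.
    for base in (("description",), ("description", "fileUnit")):
        term = _get(record, *base, "accessRestriction", "status", "termName")
        if isinstance(term, str) and term.strip().lower().startswith("restricted"):
            return "restricted"

    # 4. Otherwise fall through to the unstarted / incomplete / complete flag.
    return "check_flag"
```

Keeping the function free of network calls means it can be exercised with small hand-written dictionaries before any request is sent to the catalog.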
2016-12-11 02:04 – 02:36	Hsin Hsiao	r471 – r501
Diff: (35 lines unchanged) .2 16 update + OCR server is up and running at https://nationa-treasure-vision.herokuapp.com/vision + + Source Code + https://github.com/national-treasures-tw/vision test by posting with { types: 'text', imageUrl: 'YOU_IMAGE_URL' } + please do not abuse this as we have only 1000 free request quota
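A test post to the OCR server might look like the sketch below. The endpoint URL and the `types` / `imageUrl` fields are taken from the notes as written; sending the body as JSON with a `Content-Type: application/json` header is an assumption, as are the helper names `build_ocr_request` and `run_ocr`. The response format is not documented in the notes, so the raw body is returned untouched.

```python
import json
import urllib.request

# URL exactly as written in the notes.
OCR_ENDPOINT = "https://nationa-treasure-vision.herokuapp.com/vision"


def build_ocr_request(image_url: str) -> urllib.request.Request:
    """Build (but do not send) the POST request described in the notes."""
    payload = {"types": "text", "imageUrl": image_url}
    return urllib.request.Request(
        OCR_ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        # Assumption: the server expects a JSON body.
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_ocr(image_url: str, timeout: float = 30.0) -> str:
    """Send the request and return the raw response body as text."""
    with urllib.request.urlopen(build_ocr_request(image_url), timeout=timeout) as resp:
        return resp.read().decode("utf-8")
```

Because the quota is limited to 1000 free requests, it is worth inspecting the output of `build_ocr_request` first and calling `run_ocr` only sparingly.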
2016-12-10 19:02 – 19:42	Hsin Hsiao	r366 – r470
Diff: (31 lines unchanged) Does the website need engineers? + + 12.1 + .2 + 16 update + OCR server is up and running at + https://nationa-treasure-vision.herokuapp.com/vision + + test by posting with { types: 'text', imageUrl: 'YOU_IMAGE_URL' }
2016-11-15 21:50 – 21:50	Hsin Hsiao	r364 – r365
Diff: (2 lines unchanged) Document Index Server + NARA API Github (query example) Nation Archive Api recorder (first 100 row only out of 12600) https://github.com/hsin421/tw-national-treasure (26 lines unchanged)
2016-11-15 20:23 – 20:24	Hsin Hsiao	r303 – r363
Diff: (5 lines unchanged) https://github.com/hsin421/tw-national-treasure + Suggested by Simon Liu + Digital archive management system (open source): Fedora commons OCR SERVER (21 lines unchanged)
2016-11-08 16:25 – 16:25	Hsin Hsiao	r288 – r302
Diff: 國家寶藏松 - Backend Requirements + + Document Index Server + + Nation Archive Api recorder (first 100 row only out of 12600) + https://github.com/hsin421/tw-national-treasure + OCR SERVER (21 lines unchanged)
2016-11-07 15:31 – 15:32	Hsin Hsiao	r285 – r287
Diff: (24 lines unchanged)
2016-11-06 20:56 – 20:56	雨蒼林	r282 – r284
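Diff: (21 lines unchanged) *Alternatively, this problem can be skipped for now, since word order does not affect keyword search results. - j; + Does the website need engineers?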
2016-11-06 20:56	(unknown)	r281
Diff: (24 lines unchanged)
2016-11-06 20:56	雨蒼林	r280
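Diff: (20 lines unchanged) Write an app that rotates the photographed document to a suitable angle before OCR Alternatively, this problem can be skipped for now, since word order does not affect keyword search results. + + j;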
2016-11-06 20:17 – 20:23	Ti-Yen Lan	r277 – r279
Diff: (22 lines unchanged)
2016-11-06 14:02 – 15:22	Ti-Yen Lan	r44 – r276
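Diff: (9 lines unchanged) Apply for an API key: https://developers.google.com/api-client-library/python/guide/aaa_apikeys + + Preliminary test: + Relatively clean document: OCR result + Most of the content is captured correctly. + Document with handwritten content: OCR result + Handwriting that is too sloppy or blurry is not captured. + Document photographed with shadows: OCR result + Text in the shadowed area is captured, but because the photo was not taken level, some words come out in a different order than in the document. + Possible solutions: + Write an app that rotates the photographed document to a suitable angle before OCR + *Alternatively, this problem can be skipped for now, since word order does not affect keyword search results.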
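The point that word order does not affect keyword search can be illustrated with a tiny sketch. It assumes whitespace-separated tokens and single-word keywords; the function names `tokens` and `matches` are hypothetical, and real OCR output (especially Chinese text or multi-word phrases) would need a proper tokenizer.

```python
def tokens(text: str) -> set:
    """Split OCR output into a set of lowercase tokens; order is discarded."""
    return set(text.lower().split())


def matches(keyword: str, ocr_text: str) -> bool:
    """True if the single-word keyword appears anywhere in the OCR output."""
    return keyword.lower() in tokens(ocr_text)


# The same words in a different order still match the same keywords.
in_order = "treaty signed in taipei"
scrambled = "taipei in signed treaty"
print(matches("treaty", in_order), matches("treaty", scrambled))
```

Phrase search is a different matter: a query such as "signed in taipei" depends on word order, so the rotation step would still matter there.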
2016-11-05 20:49 – 20:54	Ti-Yen Lan	r25 – r43
Diff: (4 lines unchanged) OCR with Python and Google Cloud Vision API reference: https://gist.github.com/dannguyen/a0b69c84ebc00c54c94d + repo: + https://github.com/tl578/g0v-nyc + + Apply for an API key: + https://developers.google.com/api-client-library/python/guide/aaa_apikeys
2016-11-05 20:02 – 20:46	Hsin Hsiao	r8 – r24
Diff: 國家寶藏松 - Backend Requirements - OCR with Python and Google Cloud Vision API: + + OCR SERVER + + OCR with Python and Google Cloud Vision API reference: https://gist.github.com/dannguyen/a0b69c84ebc00c54c94d
2016-11-05 20:02	Ti-Yen Lan	r7
Diff: 國家寶藏松 - Backend Requirements + OCR with Python and Google Cloud Vision API: + https://gist.github.com/dannguyen/a0b69c84ebc00c54c94d
2016-11-05 19:57	Hsin Hsiao	r6
Diff: 國家寶藏松 - Backend Requirements - - This pad text is synchronized as you type, so that everyone viewing this page sees the same text. This allows you to collaborate seamlessly on documents!
2016-10-30 10:09 – 10:10	Hsin Hsiao	r1 – r5
Diff: - Untitled + 國家寶藏松 - Backend Requirements This pad text is synchronized as you type, so that everyone viewing this page sees the same text. This allows you to collaborate seamlessly on documents!
2016-10-30 10:09	(unknown)	r0
Diff: + Untitled + This pad text is synchronized as you type, so that everyone viewing this page sees the same text. This allows you to collaborate seamlessly on documents!