High-speed document scanning service

Is your collection a good match for high-speed sheet feeding?

Source material characteristics

Loose sheets

  • Fasteners (e.g., staples, paperclips) need to be removed prior to scanning. Bulk fasteners like large binder clips can typically be removed at scan time.
  • No additional paper/items may be affixed to sheets. In cases where a Post-it (or similar affixed item) is found at scanning time, the scanning technician will have to hand-feed the page.
  • No extremely brittle pages with broken edges.
  • No deckled (rough and irregular) edges. Only clean regular edges can be auto-fed. In cases where an irregular edge fails, the operator will have to hand-feed the page. 
  • In some cases, barcodes will need to be removed or lined out with a marker prior to scanning.
  • Small variations in page size can be tolerated within a batch. In cases where size variations are extreme, the transitions need to be flagged prior to scanning.
  • Cockled or wrinkled sheets can be accommodated in most cases, as long as the scanner's vacuum transport can flatten them sufficiently.
  • Collections where items suitable for sheet-fed scanning are mixed with a high proportion of unsuitable items are ineligible for this service.

Sheet size

  • Max: 11" x 17"
  • Min: 2.5" x 3.5"

Paper weight range

Approximately 16 lb (comparable to tracing paper) to approximately 53 lb (comparable to an index card).

Note that thin papers need to be evaluated carefully as the scanner can only scan pages that have some directional stiffness.
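
The size and weight limits above can be expressed as a simple pre-screening check. The following is a minimal sketch, not part of the service itself: the `Sheet` type and `sheet_problems` function are hypothetical names, and the check assumes sheet orientation does not matter. Because the weight limits are approximate and thin papers still need individual evaluation for stiffness, a passing result only means a sheet is worth evaluating further.

```python
from dataclasses import dataclass

# Limits copied from the lists above (inches and pounds).
MIN_SIZE_IN = (2.5, 3.5)
MAX_SIZE_IN = (11.0, 17.0)
MIN_WEIGHT_LB = 16.0   # approximate, per the service description
MAX_WEIGHT_LB = 53.0   # approximate, per the service description


@dataclass
class Sheet:
    """Hypothetical record for one sheet's measurements."""
    width_in: float
    height_in: float
    weight_lb: float


def sheet_problems(sheet: Sheet) -> list[str]:
    """Return the reasons a sheet falls outside the stated limits (empty if none)."""
    problems = []
    # Compare short side to short side and long side to long side.
    # Assumption: orientation does not matter (not stated in the source).
    short, long_ = sorted((sheet.width_in, sheet.height_in))
    if short < MIN_SIZE_IN[0] or long_ < MIN_SIZE_IN[1]:
        problems.append("smaller than 2.5 x 3.5 in")
    if short > MAX_SIZE_IN[0] or long_ > MAX_SIZE_IN[1]:
        problems.append("larger than 11 x 17 in")
    if not MIN_WEIGHT_LB <= sheet.weight_lb <= MAX_WEIGHT_LB:
        problems.append("paper weight outside roughly 16-53 lb")
    return problems
```

For example, `sheet_problems(Sheet(8.5, 11, 20))` returns an empty list, while a 12" x 18" sheet is reported as too large.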

Additional constraints

  • Black/very dark pages cannot be scanned. These are invisible to the scanner's sensors.
  • Documents/objects of fewer than 50 sheets cannot be scanned efficiently with this scanner. Collections of documents with few sheets each will be matched with another service.
  • Document batches may be scanned in simplex (front of sheet only) or duplex (both sheet sides) mode. Blank page-sides will be included.
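
Two of the constraints above are easy to work through numerically: the 50-sheet threshold and the effect of simplex versus duplex mode on the number of page images. The sketch below uses hypothetical function names and assumes that every scanned side yields exactly one image, which follows from blank page-sides being included.

```python
MIN_SHEETS_PER_DOCUMENT = 50  # threshold from the list above


def images_per_document(sheets: int, duplex: bool) -> int:
    """Count page images produced for one document; blank sides are included."""
    if sheets < 0:
        raise ValueError("sheets must be non-negative")
    # Duplex captures both sides of every sheet, simplex only the front.
    return sheets * 2 if duplex else sheets


def is_efficient_for_sheet_feeding(sheets: int) -> bool:
    """True when the document meets the 50-sheet threshold."""
    return sheets >= MIN_SHEETS_PER_DOCUMENT
```

A 120-sheet document scanned in duplex mode would therefore yield 240 images, including any blank versos.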

Caution

When considering disbinding bound published materials for this service, be aware that text, charts, or images that span both pages of an opening will likely lose content. This loss originates in the disbinding process, not the scanning.

Outputs

  • 8 bits per channel, 3-channel (RGB) color images
    • JPEG2000 format, lossless compression
    • JPEG2000 format, lossy compression (future service)
    • 300 ppi
  • 1-bit bitonal images
    • TIFF (G4 compressed)
  • OCR (uncorrected)
    • UTF-8 text, one file per corresponding page image.
    • UTF-8 text, ALTO format, one file per corresponding page image (future service)
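
For storage planning, the color image specification above (300 ppi, 3 channels, 8 bits per channel) determines pixel dimensions and uncompressed size for a given sheet size. The following sketch uses hypothetical function names and assumes the image is exactly the size of the sheet. Actual JPEG2000 files will be smaller, since even lossless compression reduces file size, and the compression ratio depends on content.

```python
PPI = 300              # resolution listed for the color images
CHANNELS = 3           # RGB
BYTES_PER_CHANNEL = 1  # 8 bits per channel


def pixel_dimensions(width_in: float, height_in: float, ppi: int = PPI) -> tuple[int, int]:
    """Pixel width and height of a sheet scanned at the given resolution."""
    return round(width_in * ppi), round(height_in * ppi)


def uncompressed_bytes(width_in: float, height_in: float, ppi: int = PPI) -> int:
    """Size of the raw RGB raster before any compression."""
    width_px, height_px = pixel_dimensions(width_in, height_in, ppi)
    return width_px * height_px * CHANNELS * BYTES_PER_CHANNEL
```

A letter-size sheet (8.5" x 11") comes out at 2550 x 3300 pixels, or 25,245,000 bytes uncompressed; the maximum 11" x 17" sheet comes out at 3300 x 5100 pixels, or 50,490,000 bytes.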

Pricing

Scanning throughput varies widely with the number of pages per document and the level of advance preparation. Pricing is established based on project-specific characteristics, and deviations from default pricing are made when necessary.

Defaults:

  • Image capture: $0.15 each
  • Uncorrected OCR: $0.03 per image
  • DRS deposit: $0.00
  • Automatically generated structural metadata (METS): $0.00
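
The default rates above can be combined into a rough estimate. This is a sketch only: the `estimate_cost` function is hypothetical, it assumes "each" in the image capture rate means each page image, and it treats the $0.00 items as per-image charges for simplicity. Actual project pricing may deviate from the defaults, as noted above.

```python
from decimal import Decimal

# Default rates from the list above, in US dollars.
IMAGE_CAPTURE = Decimal("0.15")    # assumed to be per page image
UNCORRECTED_OCR = Decimal("0.03")  # per image
DRS_DEPOSIT = Decimal("0.00")
METS_METADATA = Decimal("0.00")


def estimate_cost(images: int, with_ocr: bool = True) -> Decimal:
    """Estimate the default-rate cost for a given number of page images."""
    if images < 0:
        raise ValueError("images must be non-negative")
    per_image = IMAGE_CAPTURE + DRS_DEPOSIT + METS_METADATA
    if with_ocr:
        per_image += UNCORRECTED_OCR
    return per_image * images
```

Under these assumptions, 1,000 images with OCR come to $180.00, and the same 1,000 images without OCR come to $150.00.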

Examples