Microsoft Purview Data Security Investigations supports Optical Character Recognition (OCR) text extraction for supported image file types. The following table lists the currently supported image file types and indicates whether each file type is supported for file identification, metadata extraction, and OCR text extraction.
Image
Image extraction is part of adding items to an investigation scope, and automatic OCR processing doesn't incur an additional charge for your organization.
| MIME type | File identification | Metadata extraction | OCR text extraction | Possible extensions |
|---|---|---|---|---|
| image/bmp | Yes | Yes | Yes | .bmp |
| image/emf | Yes | Yes | Yes | .emf |
| image/gif | Yes | Yes | Yes | .gif |
| image/jpeg | Yes | Yes | Yes | .jpeg; .jpg |
| image/png | Yes | Yes | Yes | .png |
| image/svg+xml | Yes | Yes | Yes | .svg |
| image/tiff | Yes | Yes | Yes | .tif |
| image/vnd.dwg | Yes | Yes | Yes | .dwg; .dxf |
| image/wmf | Yes | Yes | Yes | .wmf |
Note
The OCR text extraction column indicates the image formats from which text can be extracted when data is automatically vectorized. OCR text extraction occurs automatically during data preparation, and the extracted text is vectorized for use in AI-based analysis tools.
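
As an illustration only, the table above can be transcribed into a small lookup for checking locally whether a file's extension appears in the supported list before you add it to an investigation scope. This is a minimal sketch and not part of any Microsoft Purview API: the names `SUPPORTED_IMAGE_TYPES` and `listed_mime_type` are hypothetical, and the check looks only at the file extension, whereas the service's own file identification may work differently.

```python
from pathlib import PurePath
from typing import Optional

# Extension-to-MIME-type mapping transcribed from the table above.
# Hypothetical helper data; not an official Purview artifact.
SUPPORTED_IMAGE_TYPES = {
    ".bmp": "image/bmp",
    ".emf": "image/emf",
    ".gif": "image/gif",
    ".jpeg": "image/jpeg",
    ".jpg": "image/jpeg",
    ".png": "image/png",
    ".svg": "image/svg+xml",
    ".tif": "image/tiff",
    ".dwg": "image/vnd.dwg",
    ".dxf": "image/vnd.dwg",
    ".wmf": "image/wmf",
}


def listed_mime_type(filename: str) -> Optional[str]:
    """Return the MIME type listed for the file's extension, or None if unlisted.

    The comparison is case-insensitive and based on the extension alone
    (an assumption of this sketch).
    """
    suffix = PurePath(filename).suffix.lower()
    return SUPPORTED_IMAGE_TYPES.get(suffix)
```

For example, `listed_mime_type("scan.JPG")` returns `"image/jpeg"`, while `listed_mime_type("notes.docx")` returns `None` because that extension isn't in the table.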