Core Parsing Methods
parse()
Parse local files, file-like objects, or raw bytes content.
Files to parse. Can be:
- File path (string or Path object)
- Raw bytes content
- File-like object (BinaryIO)
- List of any of the above
Processing mode: DEFAULT or ADVANCED
Callback function to monitor parsing progress
Maximum time to wait for parsing completion (seconds)
Interval between status checks (seconds)
Returns: DocumentBatch - Collection of parsed documents
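The input kinds listed for parse() can be illustrated with a minimal sketch. `StubClient` and `describe_input` are hypothetical stand-ins so the example runs without the real client, and the single positional argument to `parse` is an assumption, not a confirmed signature. The stub returns labels rather than a DocumentBatch.

```python
import io
from pathlib import Path
from typing import BinaryIO, List, Union

# The input kinds listed above: a path (str or Path), raw bytes, or a binary file-like object.
FileInput = Union[str, Path, bytes, BinaryIO]


def describe_input(item: FileInput) -> str:
    """Classify one input the same way the parameter description does."""
    if isinstance(item, (str, Path)):
        return "path"
    if isinstance(item, bytes):
        return "bytes"
    if hasattr(item, "read"):
        return "file-like"
    raise TypeError(f"unsupported input type: {type(item).__name__}")


class StubClient:
    """Hypothetical stand-in for the real client; it only labels what it receives."""

    def parse(self, files: Union[FileInput, List[FileInput]]) -> List[str]:
        # Accept a single item or a list, as the parameter description allows.
        items = files if isinstance(files, list) else [files]
        return [describe_input(item) for item in items]


client = StubClient()
kinds = client.parse([
    "report.pdf",                    # file path as a string
    Path("notes.docx"),              # file path as a Path object
    b"%PDF-1.7 raw bytes",           # raw bytes content
    io.BytesIO(b"in-memory file"),   # file-like object (BinaryIO)
])
print(kinds)
```

With the real client, the same mixed list would presumably be passed in a single call; nothing in this sketch reads the files from disk.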
parse_urls()
Parse documents from URLs.
URLs to parse. Can be a single URL string or list of URLs
Processing mode: DEFAULT or ADVANCED
Callback function to monitor parsing progress
Maximum time to wait for parsing completion (seconds)
Interval between status checks (seconds)
Returns: DocumentBatch - Collection of parsed documents
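Because parse_urls() accepts either one URL string or a list, callers may want to normalize and sanity-check input before submitting it. The helper below is a hypothetical client-side sketch; `normalize_urls` is not part of the library, and the restriction to http and https is an assumption, since the accepted schemes are not stated here.

```python
from typing import List, Union
from urllib.parse import urlparse


def normalize_urls(urls: Union[str, List[str]]) -> List[str]:
    """Return a list of URLs, wrapping a single string and rejecting malformed entries."""
    items = [urls] if isinstance(urls, str) else list(urls)
    for url in items:
        parts = urlparse(url)
        # Assumption: only http(s) URLs with a host are worth submitting.
        if parts.scheme not in ("http", "https") or not parts.netloc:
            raise ValueError(f"not an http(s) URL: {url!r}")
    return items


print(normalize_urls("https://example.com/report.pdf"))
```

The normalized list can then be handed to parse_urls() in one call.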
get_job_status()
Get the current status of a parsing job.
The job ID to check status for
Returns: JobStatus - Current job status information
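The timeout and polling-interval parameters on the parse methods describe a polling loop around a status check. The sketch below shows that pattern in isolation. `wait_for_job` is a hypothetical helper, the status strings are assumed names, and the real JobStatus object likely carries more than a single string.

```python
import time
from typing import Callable, Sequence


def wait_for_job(
    get_status: Callable[[str], str],
    job_id: str,
    timeout: float = 60.0,
    poll_interval: float = 2.0,
    done_states: Sequence[str] = ("complete", "failed"),  # assumed terminal state names
) -> str:
    """Poll get_status(job_id) until a terminal state, or raise after timeout seconds."""
    deadline = time.monotonic() + timeout
    while True:
        status = get_status(job_id)
        if status in done_states:
            return status
        # Stop if the next check would land past the deadline.
        if time.monotonic() + poll_interval > deadline:
            raise TimeoutError(f"job {job_id} still {status!r} after {timeout}s")
        time.sleep(poll_interval)


# Fake status source: reports "processing" twice, then "complete".
_responses = iter(["processing", "processing", "complete"])
final = wait_for_job(lambda job_id: next(_responses), "job-123",
                     timeout=5.0, poll_interval=0.01)
print(final)
```

A shorter interval gives faster completion detection at the cost of more status requests; the timeout bounds the total wait either way.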
Amazon S3 Methods
list_s3_buckets()
List available S3 buckets.
Returns: S3BucketList - List of available S3 buckets
list_s3_folder()
List contents of an S3 folder.
S3 bucket name
Path within the bucket (empty for root)
Maximum number of items to return
Returns: S3FolderContents - Contents of the S3 folder
parse_s3_folder()
Parse all documents in an S3 folder.
S3 bucket name
Path within the bucket to parse
Processing mode: DEFAULT or ADVANCED
Callback function to monitor parsing progress
Maximum time to wait for parsing completion (seconds)
Interval between status checks (seconds)
Returns: DocumentBatch - Collection of parsed documents
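The three S3 methods combine naturally: confirm the bucket is visible, check that the folder has contents, then parse it. The sketch below shows that sequence against an in-memory fake. The parameter names, the default item limit, and the plain-list return values are all assumptions; the real methods return S3BucketList, S3FolderContents, and DocumentBatch objects.

```python
from typing import List, Protocol


class S3Capable(Protocol):
    """Assumed shape of the three S3 methods; parameter names are guesses."""

    def list_s3_buckets(self) -> List[str]: ...
    def list_s3_folder(self, bucket: str, path: str = "", max_items: int = 100) -> List[str]: ...
    def parse_s3_folder(self, bucket: str, path: str) -> List[str]: ...


def parse_if_present(client: S3Capable, bucket: str, path: str) -> List[str]:
    """Parse bucket/path only if the bucket is available and the folder is non-empty."""
    if bucket not in client.list_s3_buckets():
        raise ValueError(f"bucket {bucket!r} is not available")
    if not client.list_s3_folder(bucket, path):
        return []  # nothing to parse
    return client.parse_s3_folder(bucket, path)


class FakeS3Client:
    """In-memory stand-in so the sketch runs without AWS access."""

    def __init__(self, tree):
        self.tree = tree  # {bucket: {folder_path: [file names]}}

    def list_s3_buckets(self):
        return list(self.tree)

    def list_s3_folder(self, bucket, path="", max_items=100):
        return self.tree[bucket].get(path, [])[:max_items]

    def parse_s3_folder(self, bucket, path):
        return [f"parsed:{name}" for name in self.tree[bucket].get(path, [])]


fake = FakeS3Client({"docs-bucket": {"reports/": ["q1.pdf", "q2.pdf"], "": []}})
print(parse_if_present(fake, "docs-bucket", "reports/"))
```

Checking the folder first avoids starting a parse job for an empty or mistyped path.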
Microsoft SharePoint Methods
list_sharepoint_sites()
List available SharePoint sites.
Returns: SharePointSiteList - List of available SharePoint sites
list_sharepoint_drives()
List drives in a SharePoint site.
SharePoint site ID
Returns: SharePointDriveList - List of drives in the site
parse_sharepoint_folder()
Parse documents in a SharePoint folder.
SharePoint site ID
SharePoint drive ID
Path within the drive to parse
Processing mode: DEFAULT or ADVANCED
Callback function to monitor parsing progress
Maximum time to wait for parsing completion (seconds)
Interval between status checks (seconds)
Returns: DocumentBatch - Collection of parsed documents
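parse_sharepoint_folder() needs a site ID and a drive ID, which the two listing methods supply. The sketch below resolves human-readable names to IDs before parsing. `FakeSharePointClient` and `parse_by_names` are hypothetical, and the dictionary records with `id` and `name` keys are an assumed shape; the real listing methods return SharePointSiteList and SharePointDriveList objects.

```python
from typing import Dict, List


class FakeSharePointClient:
    """Hypothetical stand-in that serves a fixed site and drive."""

    def __init__(self):
        self._sites = [{"id": "site-1", "name": "Engineering"}]
        self._drives = {"site-1": [{"id": "drive-9", "name": "Documents"}]}

    def list_sharepoint_sites(self) -> List[Dict[str, str]]:
        return self._sites

    def list_sharepoint_drives(self, site_id: str) -> List[Dict[str, str]]:
        return self._drives[site_id]

    def parse_sharepoint_folder(self, site_id: str, drive_id: str, path: str) -> str:
        return f"parsed {site_id}/{drive_id}/{path}"


def parse_by_names(client, site_name: str, drive_name: str, path: str) -> str:
    """Resolve site and drive names to IDs, then parse the folder at path."""
    site = next(s for s in client.list_sharepoint_sites() if s["name"] == site_name)
    drive = next(d for d in client.list_sharepoint_drives(site["id"])
                 if d["name"] == drive_name)
    return client.parse_sharepoint_folder(site["id"], drive["id"], path)


result = parse_by_names(FakeSharePointClient(), "Engineering", "Documents", "specs/2024")
print(result)
```

In this sketch an unknown site or drive name surfaces as `StopIteration` from `next`; production code would likely translate that into a clearer error.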
Box Methods
list_box_folders()
List folders in Box.
Parent folder ID (“0” for root folder)
Returns: BoxFolderList - List of folders
parse_box_folder()
Parse documents in a Box folder.
Box folder ID to parse
Processing mode: DEFAULT or ADVANCED
Callback function to monitor parsing progress
Maximum time to wait for parsing completion (seconds)
Interval between status checks (seconds)
Returns: DocumentBatch - Collection of parsed documents
Dropbox Methods
list_dropbox_folders()
List folders in Dropbox.
Dropbox folder path (empty for root)
Returns: DropboxFolderList - List of folders
parse_dropbox_folder()
Parse documents in a Dropbox folder.
Dropbox folder path to parse
Processing mode: DEFAULT or ADVANCED
Callback function to monitor parsing progress
Maximum time to wait for parsing completion (seconds)
Interval between status checks (seconds)
Returns: DocumentBatch - Collection of parsed documents
Async Methods
All methods are available in async versions with the AsyncLexa client.
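The main benefit of an async client is running several parse calls concurrently. The sketch below shows the await pattern with a hypothetical `FakeAsyncClient` in place of AsyncLexa. How the real client is constructed, and whether it supports `async with`, is not covered here, so the stand-in is instantiated directly and returns plain lists.

```python
import asyncio
from typing import List


class FakeAsyncClient:
    """Hypothetical stand-in for AsyncLexa; only the await pattern is the point."""

    async def parse(self, files: List[str]) -> List[str]:
        await asyncio.sleep(0)  # yield to the event loop, as a network call would
        return [f"parsed:{name}" for name in files]

    async def parse_urls(self, urls: List[str]) -> List[str]:
        await asyncio.sleep(0)
        return [f"parsed:{url}" for url in urls]


async def main() -> List[List[str]]:
    client = FakeAsyncClient()
    # Start both parse calls together and wait for both results.
    return await asyncio.gather(
        client.parse(["a.pdf"]),
        client.parse_urls(["https://example.com/b.pdf"]),
    )


results = asyncio.run(main())
print(results)
```

`asyncio.gather` returns results in the order the calls were passed, regardless of which finishes first.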
Next Steps
Learn about error handling and client configuration for production use.