What sources can I use to train my ChatGPT chatbot?

Training your own chatbot on your content is very easy with Chatwith. You can use the following sources of information to train your chatbot:
  • Website URL: single pages or whole domains can be crawled for content.
  • Website sitemap: you can use your website sitemap too!
  • Files: various documents in common formats like PDF and DOCX are supported.
  • YouTube videos
  • Plain text
 
 

Additional information

Website URL

You can provide a link to a website and we’ll crawl it for all of its content. We follow the links found on the URL you provide, so related pages are added as sources automatically alongside your main page. Please make sure the pages are publicly accessible (e.g. they open in a private browsing window). Pages that require a login cannot be read by us.

Sitemap

Sitemaps (e.g. https://chatwith.tools/sitemap.xml) can be used for easier source management. This way you can be sure all the pages on your website will be read by our system.
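
If you want to preview which pages a sitemap exposes before adding it as a source, a minimal sketch along these lines may help. It assumes a standard sitemaps.org `urlset` document; the function name `list_sitemap_urls` and the sample URLs are made up for illustration and are not part of Chatwith.

```python
import xml.etree.ElementTree as ET

# Namespace used by standard sitemaps (sitemaps.org protocol 0.9).
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def list_sitemap_urls(sitemap_xml: str) -> list[str]:
    """Return the page URLs found in the <loc> entries of a sitemap."""
    root = ET.fromstring(sitemap_xml)
    return [
        loc.text.strip()
        for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)
        if loc.text
    ]


# Hypothetical sitemap content, used only to demonstrate the function.
SAMPLE = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc></url>
  <url><loc>https://example.com/pricing</loc></url>
</urlset>"""

for url in list_sitemap_urls(SAMPLE):
    print(url)
```

This sketch only handles a flat `urlset`; sitemap index files that point to other sitemaps would need an extra step.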

Files

You can upload files in a variety of formats:
  • PDF: .pdf
  • Word documents: .doc, .docx, .odt
  • PowerPoint presentations: .ppt, .pptx
  • Sheets: .csv, .xls, .xlsx
  • Plain, structured & rich text: .txt, .md, .rtf
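
If you are preparing many files for upload, you could check their extensions against the list above first. The following is only a local pre-check sketch: the helper name `is_supported_file` is hypothetical, and it inspects the file name only, not the file's actual contents.

```python
from pathlib import Path

# Extensions copied from the supported formats listed above.
SUPPORTED_EXTENSIONS = {
    ".pdf",
    ".doc", ".docx", ".odt",
    ".ppt", ".pptx",
    ".csv", ".xls", ".xlsx",
    ".txt", ".md", ".rtf",
}


def is_supported_file(filename: str) -> bool:
    """Return True if the file's extension is in the supported list (case-insensitive)."""
    return Path(filename).suffix.lower() in SUPPORTED_EXTENSIONS


for name in ["Report.PDF", "notes.md", "photo.png"]:
    print(name, is_supported_file(name))
```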
 
💡
PDF - special note
Make sure that the text in your PDF can be selected. If the text can’t be selected, it cannot be read. You can use any standard PDF viewer (e.g. macOS Preview, the Chrome browser, Adobe Acrobat) to verify this. If you upload a PDF with text that is not selectable, it will be rejected with an error or show very few training characters.

YouTube videos

You can use links to YouTube videos that have transcriptions available. Please keep in mind that only English transcriptions will be fetched. Whole playlists or channels are not supported yet.

Dynamic data

You can also connect your chatbot to 5000 apps, or to your own API using an OpenAPI specification, and let it access up-to-date information (e.g. search a database). Learn more here.
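
For readers who have not written an OpenAPI specification before, here is a rough sketch of what a minimal OpenAPI 3.0 document for a database search endpoint could look like, built in Python and printed as JSON. Everything in it is a placeholder: the API title, the server URL, the `/products` path, and the `q` parameter are assumptions for illustration, and this article does not state which fields Chatwith itself requires.

```python
import json

# Hypothetical API description: all names, URLs, and paths are placeholders.
openapi_spec = {
    "openapi": "3.0.3",
    "info": {"title": "Example Product Search API", "version": "1.0.0"},
    "servers": [{"url": "https://api.example.com"}],
    "paths": {
        "/products": {
            "get": {
                "operationId": "searchProducts",
                "summary": "Search the product database by keyword",
                "parameters": [
                    {
                        "name": "q",
                        "in": "query",
                        "required": True,
                        "schema": {"type": "string"},
                    }
                ],
                "responses": {
                    "200": {"description": "Products matching the keyword"}
                },
            }
        }
    },
}

# Serialize the description to JSON so it can be saved or shared as a file.
spec_json = json.dumps(openapi_spec, indent=2)
print(spec_json)
```

A clear `summary` on each operation is generally useful, since it describes in plain language what the endpoint does.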

What data formats are not supported yet?

  • Private Notion pages (public pages work!)
  • Google Drive or Docs (please download the files first)
  • YouTube channels or playlists
 
💡
Would you like to train your chatbot with a data format not listed above? Please reach out to support@chatwith.tools; our team will gladly help you with custom training needs.