I would like to build a machine learning mechanism (in Python) that will be able to process a large collection of text files:
a. categorize them according to defined categories,
b. within each category, extract interesting text fragments (plain text) and assign them to the appropriate fields.
This job covers only the implementation of the ML mechanism itself: the files are already indexed and their content is stored in a Postgres database. One record represents one file, and the fields mentioned in point b. are the fields of that record.
What I care about:
- the ability to teach the algorithm myself by assigning records to individual categories and marking text fragments as appropriate for a given field (this can be done through a simple UI, but that is not required),
- good documentation that will allow further development of the mechanism built.
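To illustrate what I mean by teaching the algorithm, here is a rough sketch only, not a prescribed design. It assumes a simple Naive Bayes text classifier written with the Python standard library, which may well not be the right technique for the final solution. The class name `TeachableClassifier`, its `teach` and `predict` methods, and the sample texts and categories are all hypothetical; a real implementation would probably use an established ML library and read the records from the Postgres table.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


class TeachableClassifier:
    """Hypothetical multinomial Naive Bayes, taught one labeled record at a time."""

    def __init__(self):
        self.doc_counts = Counter()              # category -> number of taught records
        self.word_counts = defaultdict(Counter)  # category -> token -> count
        self.vocabulary = set()                  # all tokens seen while teaching

    def teach(self, text, category):
        """Add one labeled example by updating counts; nothing is retrained from scratch."""
        tokens = tokenize(text)
        self.doc_counts[category] += 1
        self.word_counts[category].update(tokens)
        self.vocabulary.update(tokens)

    def predict(self, text):
        """Return the most probable category, or None if nothing has been taught yet."""
        if not self.doc_counts:
            return None
        tokens = tokenize(text)
        total_docs = sum(self.doc_counts.values())
        vocab_size = len(self.vocabulary) + 1  # +1 reserves probability mass for unseen tokens
        best_category, best_score = None, -math.inf
        for category, n_docs in self.doc_counts.items():
            counts = self.word_counts[category]
            total_words = sum(counts.values())
            score = math.log(n_docs / total_docs)  # log prior of the category
            for token in tokens:
                # Laplace (add-one) smoothing keeps unseen tokens from zeroing the score
                score += math.log((counts[token] + 1) / (total_words + vocab_size))
            if score > best_score:
                best_category, best_score = category, score
        return best_category


# Sample texts and categories are made up for illustration.
clf = TeachableClassifier()
clf.teach("Invoice number 123, total amount due 500 EUR", "invoice")
clf.teach("Payment due within 14 days of invoice date", "invoice")
clf.teach("This agreement is made between the parties", "contract")
clf.teach("The parties agree to the terms of this contract", "contract")
print(clf.predict("Amount due on this invoice"))
```

Each call to `teach` corresponds to me assigning one record to a category, whether through a UI or directly. Marking text fragments for fields (point b.) would need a separate mechanism that this sketch does not cover.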
Because I am not proficient in ML, I expect the person who takes on this job to propose how to build and operate the solution. I am open to suggestions. The budget is to be determined.
Please feel free to contact me.