Dave Jarvis' Repositories

git clone https://repo.autonoma.ca/repo/segmenter.git

Removed hard breaks from text.

Author: djarvis <email>
Date: 2018-12-18 00:04:15 GMT-0800
Commit: 9deb89a931f7c9d740fd9267dbe124a828569f58
Parent: 7466ea7
README.md
# Word Split: Text Segmentation Tool
-Word Split is a Java application for Java software developers. The application
-provides a way to split conjoined words that lack punctuation into separate
-words. The software is intended to split database column names into their
-human-readable equivalent text.
+Word Split is a Java application for Java software developers. The application provides a way to split conjoined words that lack punctuation into separate words. The software is intended to split database column names into their human-readable equivalent text.
Word Split takes the following input files:
* a probability lexicon, one word and probability per line (CSV format)
* a list of conjoined phrases, one per line
-Word Split will use the lexicon to separate the list of conjoined phrases.
-The resulting segmented phrases are written to standard out.
+Word Split will use the lexicon to separate the list of conjoined phrases. The resulting segmented phrases are written to standard out.
## Contents
Delta: 2 lines added, 6 lines removed, 4-line decrease
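
The README text in this diff describes using a probability lexicon to separate conjoined phrases. As an illustration only, the following is a minimal sketch of one common way to do that: a dynamic-programming search for the split with the highest total log probability. It is not taken from the repository; the class name `WordSplitSketch`, the methods `parseLexicon` and `segment`, the `word,probability` line layout, and the penalty for characters not covered by the lexicon are all assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; not the repository's actual implementation.
public class WordSplitSketch {
  // Assumed penalty (as a log probability) for a single character
  // that no lexicon word covers.
  private static final double UNKNOWN_CHAR_LOG_PROB = Math.log(1e-10);

  /** Parses lines of the assumed form "word,probability" into a map. */
  static Map<String, Double> parseLexicon(List<String> lines) {
    Map<String, Double> lexicon = new HashMap<>();
    for (String line : lines) {
      String[] fields = line.split(",");
      if (fields.length == 2) {
        lexicon.put(fields[0].trim().toLowerCase(),
                    Double.parseDouble(fields[1].trim()));
      }
    }
    return lexicon;
  }

  /** Returns the most probable split of a conjoined phrase. */
  static List<String> segment(String phrase, Map<String, Double> lexicon) {
    String text = phrase.toLowerCase();
    int n = text.length();
    double[] best = new double[n + 1]; // best score for text[0, i)
    int[] back = new int[n + 1];       // start index of the last word
    Arrays.fill(best, Double.NEGATIVE_INFINITY);
    best[0] = 0.0;

    for (int end = 1; end <= n; end++) {
      for (int start = 0; start < end; start++) {
        Double p = lexicon.get(text.substring(start, end));
        double score;
        if (p != null && p > 0) {
          score = Math.log(p);
        } else if (end - start == 1) {
          score = UNKNOWN_CHAR_LOG_PROB; // fall back to a lone character
        } else {
          continue; // multi-character candidate not in the lexicon
        }
        if (best[start] + score > best[end]) {
          best[end] = best[start] + score;
          back[end] = start;
        }
      }
    }

    // Walk the back-pointers from the end to recover the words.
    List<String> words = new ArrayList<>();
    for (int end = n; end > 0; end = back[end]) {
      words.add(text.substring(back[end], end));
    }
    Collections.reverse(words);
    return words;
  }

  public static void main(String[] args) {
    Map<String, Double> lexicon = parseLexicon(
        List.of("first,0.02", "name,0.03", "account,0.01", "id,0.005"));
    for (String phrase : List.of("firstname", "accountid")) {
      // One segmented phrase per line on standard output.
      System.out.println(String.join(" ", segment(phrase, lexicon)));
    }
  }
}
```

With the four-word sample lexicon above, the sketch splits `firstname` into `first name` and `accountid` into `account id`. Working in log probabilities avoids underflow when many small probabilities are multiplied, and the single-character fallback keeps the search from failing on text the lexicon does not cover.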