Readit News
binarymax · 6 years ago
The thing that I like about Duckling is that it is a rule-based system, which can easily be interrogated. Model-based text extraction is much harder to fix when there is a bug. I use Duckling as a service for value extraction from queries and content, alongside a model-based system for NER (such as spaCy). Using both together makes for more accurate enrichment in general (by cross-referencing values between the two and adding exception rules).
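One way the cross-referencing described above could look, as a minimal sketch: rule-based spans are trusted, and a model span is kept only where no rule-based span overlaps it. The `Span` type, the `find` helper, the labels, and the "rules win on overlap" policy are all assumptions for illustration; the spans are hard-coded stand-ins, not real Duckling or spaCy output.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Span:
    start: int   # inclusive character offset
    end: int     # exclusive character offset
    label: str
    source: str  # "rules" or "model" (hypothetical tags)


def find(text, phrase, label, source):
    """Build a Span for the first occurrence of phrase in text."""
    start = text.index(phrase)
    return Span(start, start + len(phrase), label, source)


def overlaps(a, b):
    return a.start < b.end and b.start < a.end


def merge_spans(rule_spans, model_spans):
    """Keep every rule span; keep a model span only if no rule span overlaps it."""
    merged = list(rule_spans)
    for m in model_spans:
        if not any(overlaps(m, r) for r in rule_spans):
            merged.append(m)
    return sorted(merged, key=lambda s: (s.start, s.end))


text = "Book a table for two at Luigi's tomorrow at 8pm"
# Stand-in outputs: one span from the rule-based extractor, two from the NER model.
rule_spans = [find(text, "tomorrow at 8pm", "time", "rules")]
model_spans = [
    find(text, "Luigi's", "ORG", "model"),
    find(text, "tomorrow", "DATE", "model"),  # overlaps the rule span, so it is dropped
]
merged = merge_spans(rule_spans, model_spans)
for span in merged:
    print(span.label, span.source, repr(text[span.start:span.end]))
```

The exception rules mentioned above would slot in as extra checks inside `merge_spans`, for example letting the model win for specific labels.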
ar7hur · 6 years ago
It's exactly how we used it internally for wit.ai

(Very small correction: Duckling is rule-based but uses a super simple Naive Bayes classifier to prioritize between the many potential parses produced by the rules -- we see it as a hybrid approach)
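To make the hybrid idea concrete, here is a toy sketch: hand-written rules propose several candidate parses, and a small Naive Bayes classifier, trained on which rule combinations were accepted in the past, picks among them. This is not Duckling's actual classifier or feature set; the rule names, the training examples, and the use of rule names as features are assumptions for illustration only.

```python
import math
from collections import Counter


def train(examples):
    """examples: list of (rule_names, accepted) pairs, accepted being a bool."""
    class_counts = Counter()
    feature_counts = {True: Counter(), False: Counter()}
    vocab = set()
    for rules, accepted in examples:
        class_counts[accepted] += 1
        feature_counts[accepted].update(rules)
        vocab.update(rules)
    return class_counts, feature_counts, vocab


def log_odds(model, rules):
    """log P(accepted | rules) - log P(rejected | rules), with Laplace smoothing."""
    class_counts, feature_counts, vocab = model
    total = sum(class_counts.values())
    score = {}
    for cls in (True, False):
        logp = math.log((class_counts[cls] + 1) / (total + 2))
        denom = sum(feature_counts[cls].values()) + len(vocab)
        for rule in rules:
            logp += math.log((feature_counts[cls][rule] + 1) / denom)
        score[cls] = logp
    return score[True] - score[False]


def best_parse(model, candidates):
    """Return the candidate whose rule set the classifier scores highest."""
    return max(candidates, key=lambda c: log_odds(model, c["rules"]))


# Hypothetical training data: which rule combinations led to accepted parses.
examples = [
    (["integer", "at <time-of-day>"], True),
    (["integer", "at <time-of-day>"], True),
    (["integer", "<integer> <unit>"], True),
    (["integer"], False),
    (["integer"], False),
]
model = train(examples)

# Two candidate parses the rules might propose for the "6" in "meet at 6".
candidates = [
    {"dim": "number", "rules": ["integer"]},
    {"dim": "time", "rules": ["integer", "at <time-of-day>"]},
]
winner = best_parse(model, candidates)
print(winner["dim"])
```

Because the rules still generate every candidate, a wrong answer can be traced either to a missing rule or to the ranking data, which keeps the system easy to interrogate.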

ar7hur · 6 years ago
Hello HN, I'm the original author of Duckling (with @blandinw). As usual, always happy to get feedback and suggestions.
2sk21 · 6 years ago
Interesting! When I worked at IBM, we evaluated Duckling (the Haskell version) for use in the Watson Assistant product but decided to write our own numerical quantity parser/interpreter. We used ANTLR and created context-free grammars as we found that we could improve both precision and recall substantially. Sadly not open source though.
nudpiedo · 6 years ago
I must say it looks very neat from a usability point of view. Are the training data sets open? Do you think it is feasible for small app developers (who don't have thousands of examples to train on) to use Duckling as a more or less ready-made NLP parser without getting too deep into NLP and AI theory?

Are the trained sets meant to be used by different client code or languages?

ar7hur · 6 years ago
Yes all the training data is in the repo.

Duckling is suited to parsing highly structured language, typically temporal expressions (dates, times, and so on). It relies on a mix of rules and machine learning. Rules and datasets for many (human) languages are available in the repo. Thanks to this hybrid rules+ML approach (as opposed to ML alone), you don't need a lot of data to add support for what you need.

patapizza · 6 years ago
Hey, author of the Haskell re-write (https://github.com/facebook/duckling). We've implemented custom dimensions for extensibility (example: https://github.com/facebook/duckling/blob/master/exe/CustomD...).
jmiskovic · 6 years ago
Hi, thanks for dropping in. What's the status of the Clojure implementation? Would you recommend it for new projects? Is anyone looking at new or old issues? Are there potential new maintainers for the Clojure version?
ar7hur · 6 years ago
The current Clojure version is quite stable; we used it at wit.ai/Facebook for several years before moving to Haskell.

I'd love to see somebody take it over and resuscitate it! One interesting direction could be to remove the Java dependencies (mostly on Date) so that it's usable in ClojureScript. It would make a great JS library.

Deleted Comment

jcadam · 6 years ago
Out of curiosity, why the move from Clojure to Haskell?
blandinw · 6 years ago
I touched on that on Reddit: https://www.reddit.com/r/Clojure/comments/68r4lz/one_of_face...

TL;DR Haskell made more sense for us to scale with the number of requests (existing FB infra) as well as the number of engineers working on the project (type checking, etc).

l5t · 6 years ago
Scalability. More context on the move from the 2017 post https://medium.com/wit-ai/open-sourcing-our-new-duckling-47f...
notenoughbeans · 6 years ago
It knows about "Labor day" but not "Labour day".
binarymax · 6 years ago
Which language ruleset are you using? I imagine the latter is not in EN_US, but would be in EN_GB
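A toy illustration of why the locale matters for a rule-based parser: if each locale carries its own spelling patterns, "Labour day" only matches where a rule for that spelling exists. The pattern table and `find_holidays` below are hypothetical and are not taken from Duckling's actual rule files.

```python
import re

# Hypothetical per-locale rule tables; real rule sets would be far larger.
HOLIDAY_PATTERNS = {
    "en_US": {"labor day": re.compile(r"\blabor day\b", re.IGNORECASE)},
    "en_GB": {"labour day": re.compile(r"\blabour day\b", re.IGNORECASE)},
}


def find_holidays(text, locale):
    """Return the names of the holiday rules for this locale that match text."""
    rules = HOLIDAY_PATTERNS.get(locale, {})
    return [name for name, pattern in rules.items() if pattern.search(text)]


query = "See you on Labour Day"
us_matches = find_holidays(query, "en_US")
gb_matches = find_holidays(query, "en_GB")
print(us_matches, gb_matches)
```

A single tolerant pattern such as `labou?r day` shared across English locales would accept both spellings, at the cost of blurring the distinction between them.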
woadwarrior01 · 6 years ago
It’s a Haskell library now. https://github.com/facebook/duckling
mark_l_watson · 6 years ago
I came here to mention the same thing. I experimented with the Clojure version a long while ago, and evaluated the Haskell version about a year ago for a project at work. Good stuff.
nudpiedo · 6 years ago
Glad to see there is life beyond Python for corporate AI usage.
dang · 6 years ago
Vosporos · 6 years ago
You're a bit late to the party…