What are your best resources (textbooks, papers, blog posts) on web data mining? More specifically, on extracting specific elements from a webpage, e.g. a product price, description, etc. Side question: there is a whole range of approaches, from simple feature engineering on DOM tree nodes with standard classifiers to graph neural nets. Have any of you put these kinds of models into production? Any interesting tradeoffs to discuss?
Thanks!!