EvaluateReview (EvaluateReview v0.1.0)
Documentation for EvaluateReview.
Summary
Functions
Cache review results
Load cached review results
Uses Floki to recursively match a given list of selectors. CSS selectors are often shared by many different tags on a page; combining several helps the user narrow the selection down to a single tag.
Read JSON from file
Scrape DealerRater reviews
Scrape a list of URLs
Classify overly positive reviews
Functions
Specs
Cache review results
TODO: encode this as JSON rather than binary.
Credit to https://elixirforum.com/u/benwilson512
Caches review lists as binary data to avoid unnecessary web scraping and to minimize the suspicion that repeated requests might draw.
Specs
Load cached review results
Credit to https://elixirforum.com/u/benwilson512
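The two caching functions above can be pictured as a simple round trip. The following is a minimal sketch, assuming the binary format is Erlang's external term format; the module name ReviewCacheSketch, its function names, and the file path are hypothetical and are not taken from EvaluateReview itself.

```elixir
defmodule ReviewCacheSketch do
  # Hypothetical helper module; names are illustrative only.

  # Serialize any Elixir term (for example a list of {review, reviewer}
  # tuples) to Erlang's external term format and write it to `path`.
  def save(reviews, path) do
    File.write!(path, :erlang.term_to_binary(reviews))
  end

  # Read the file back and decode the binary into the original term.
  def load(path) do
    path |> File.read!() |> :erlang.binary_to_term()
  end
end

path = Path.join(System.tmp_dir!(), "reviews_cache_sketch.bin")
reviews = [{"Great service!", "alice"}, {"Fine.", "bob"}]

ReviewCacheSketch.save(reviews, path)

# Check that the round trip preserved the data.
IO.inspect(ReviewCacheSketch.load(path) == reviews)
```

Decoding with :erlang.binary_to_term/1 is only appropriate for files the program wrote itself, which is one reason a JSON encoding, as the TODO suggests, may be preferable.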
Specs
Uses Floki to recursively match a given list of selectors. CSS selectors are often shared by many different tags on a page; combining several helps the user narrow the selection down to a single tag.
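The idea of narrowing with successive selectors could look roughly like the sketch below. It assumes Floki is fetched with Mix.install (which needs network access), and the module name SelectorSketch, the function narrow/2, and the sample HTML are hypothetical; the real function in EvaluateReview may be structured differently.

```elixir
Mix.install([{:floki, "~> 0.36"}])

defmodule SelectorSketch do
  # Hypothetical helper: apply each selector to the result of the previous one.
  # With no selectors left, return whatever has been matched so far.
  def narrow(tree, []), do: tree

  def narrow(tree, [selector | rest]) do
    tree |> Floki.find(selector) |> narrow(rest)
  end
end

html = """
<div>
  <span class="italic">not a name</span>
  <span class="italic font-18">reviewer_name</span>
  <span class="font-18">also not a name</span>
</div>
"""

{:ok, document} = Floki.parse_document(html)

# Only the element matched by both selectors should remain.
document
|> SelectorSketch.narrow([".italic", ".font-18"])
|> Floki.text()
|> IO.puts()
```

Each selector on its own matches two elements in the sample markup; applied in sequence, they leave a single candidate.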
Specs
Read JSON from file
Credit to https://elixirforum.com/u/idi527
Examples
iex> filename = "/tmp/test.json"
iex> EvaluateReview.read_json(filename)
Specs
Scrape DealerRater reviews
This function attempts to scrape reviews from the given URL and returns a list of tuples; the first element of each tuple is the review text and the second is the reviewer's username.
The CSS selectors used are .review-content for the review text and the combination of .italic and .font-18 for the reviewer's username. This is an intentional shortcut. A slightly more robust approach might use the .review-container selector instead, since it seems less likely to change. I found it relatively bloated and so opted for a quicker approach that felt more elegant.
A future version might accept user-defined selectors rather than hard-coded ones, but since the use case is currently very narrow (solely scraping reviews from dealerrater.com), that seemed unnecessarily complicated.
Examples
iex> url = "https://web.archive.org/web/20201127110830/https://www.dealerrater.com/dealer/McKaig-Chevrolet-Buick-A-Dealer-For-The-People-dealer-reviews-23685/"
iex> reviews = EvaluateReview.scrape(url, [])
iex> Enum.each(reviews, fn {review, reviewer} -> IO.puts("review: #{review}, reviewer: #{reviewer}") end)
Specs
Scrape a list of URLs
Specs
Classify overly positive reviews
Takes a list of reviews in the format produced by EvaluateReview.scrape(url, []).
Produces a list of the top three offenders, ordered by severity.
The current criterion for a suspicious review is simply the number of exclamation points it contains.
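The criterion described above can be sketched as follows. This is an illustration only: the module name SuspectSketch and its function names are hypothetical, and the actual suspect_reviews implementation may differ in details such as tie-breaking.

```elixir
defmodule SuspectSketch do
  # Hypothetical scoring function: number of "!" characters in the review text.
  def exclamation_count(text) do
    text |> String.graphemes() |> Enum.count(&(&1 == "!"))
  end

  # Sort {review, reviewer} tuples by score, highest first, and keep three.
  def top_three(reviews) do
    reviews
    |> Enum.sort_by(fn {text, _reviewer} -> exclamation_count(text) end, :desc)
    |> Enum.take(3)
  end
end

reviews = [
  {"It was ok", "a"},
  {"Wow!!!", "b"},
  {"Nice!", "c"},
  {"Great!!", "d"}
]

IO.inspect(SuspectSketch.top_three(reviews))
```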
I tried passing the defaultSuspector function into suspect_reviews. Unfortunately, I could not find a way to define the function in this module and also make it available in that manner.
https://elixirforum.com/t/proposal-private-modules-general-discussion/19374/154
As a result, users can define their own suspector functions and pass them to suspect_reviews, but I can't seem to define them within this module.
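One arrangement that may address this, assuming the goal was a default scoring function that callers can override, is to capture a function from the same module in a default argument. The sketch below uses hypothetical names (SuspectorDefaultSketch, rank/2, default_suspector/1) and is not the module's actual code.

```elixir
defmodule SuspectorDefaultSketch do
  # The default value is an ordinary expression compiled inside this module,
  # so it can capture a function defined here, even a private one.
  def rank(reviews, suspector \\ &default_suspector/1) do
    Enum.sort_by(reviews, fn {text, _reviewer} -> suspector.(text) end, :desc)
  end

  # Hypothetical default scorer: count of exclamation points.
  defp default_suspector(text) do
    text |> String.graphemes() |> Enum.count(&(&1 == "!"))
  end
end

reviews = [{"Fine.", "bob"}, {"AMAZING!!", "alice"}]

# Uses the module's own default scorer.
IO.inspect(SuspectorDefaultSketch.rank(reviews))

# Uses a caller-supplied scorer instead (here: text length).
IO.inspect(SuspectorDefaultSketch.rank(reviews, &String.length/1))
```

Callers who are happy with the default pass only the review list, while others can supply any one-argument function that returns a sortable score.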
Examples
iex> url = "https://web.archive.org/web/20201127110830/https://www.dealerrater.com/dealer/McKaig-Chevrolet-Buick-A-Dealer-For-The-People-dealer-reviews-23685/"
iex> reviews = EvaluateReview.scrape(url, [])
iex> top3 = EvaluateReview.suspect_reviews(reviews)
iex> IO.inspect(top3)