V4.1.x Duplicate listings

Discussion in 'Modules / Plugins / Modifications' started by guillopuig, Nov 14, 2009.

  1. guillopuig Customer

    Is there a module available to not allow duplicate listings?
  2. Eric Barnes Guest

    When you say duplicate listings, do you mean the same listing placed twice by the same user? Or the same listing by different users?

    I am not aware of a plugin or module for either one but want to clarify.
  3. guillopuig Customer

    The same listing by the same user... spam. Something like what Craigslist uses.
  4. seymourjames All Hands On Deck

    I can see lots of issues here. What happens if the ads differ by a couple of letters? At what point do you say it is a true duplicate?
  5. guillopuig Customer

    It works pretty well for Craigslist.
  6. seymourjames All Hands On Deck

    I think you are missing the point. Saying something works for Craigslist is of no relevance at all unless you can explain what they are doing specifically. What do you mean by duplicate? Do you mean a 100% duplication, a lot of duplication, kind of duplication, seems like there is some duplication, the coherence between the distribution of words in the ads is 0.98, the correlation is 0.8 or more, 60% of the words are the same, what exactly? It is not very good if it only blocks true duplicates, and next to useless if SEO is your concern. I could just make ads, excuse the pun, 'ad infinitum' by changing just the heading to title 1, title 2, title 3 ... Would Craigslist filter those out?
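
    To make one of those criteria concrete, here is a minimal sketch of a word-overlap test (Jaccard similarity over the distinct words of two ads). It is only an illustration, not a description of what Craigslist does. The function names and the 0.6 threshold are assumptions, and the score is the share of words in common out of all distinct words in both ads, which is one of several ways to read "60% of the words are the same".

```python
import re


def word_set(text):
    """Lowercase the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard_similarity(ad_a, ad_b):
    """Shared distinct words divided by all distinct words (0.0 to 1.0)."""
    words_a, words_b = word_set(ad_a), word_set(ad_b)
    if not words_a and not words_b:
        return 1.0  # two empty ads are treated as identical
    return len(words_a & words_b) / len(words_a | words_b)


def is_near_duplicate(ad_a, ad_b, threshold=0.6):
    """Flag the pair when the overlap reaches the (assumed) threshold."""
    return jaccard_similarity(ad_a, ad_b) >= threshold


if __name__ == "__main__":
    first = "Red bike for sale title 1"
    second = "Red bike for sale title 2"
    print(jaccard_similarity(first, second))
    print(is_near_duplicate(first, second))
```

    With this measure the "title 1, title 2" trick is still flagged, because five of the seven distinct words are shared. Where to put the threshold remains a judgment call: set it too low and legitimate ads for similar items get blocked.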
  7. guillopuig Customer

    Cool answer! very entertaining!

    Well, one of the reasons I took on this new classifieds venture is Craigslist. The amount of time I invested on that page is very much out of the ordinary. Craigslist in particular is no competition in the geographical area where I live and target, so the initiative to offer something similar here presented a very solid opportunity. I chose 68 Classifieds over many other solutions as it offers a very appealing look and, above all, ease of use, not only for me but for the visitors.

    I had to be extremely, and I mean EXTREMELY, creative to be able to get some live ads on CL after I had posted similar ads. (I mean... how many different ways can you advertise the same product?) Whatever they are doing to prevent duplicate ads by the same user or even different users (as I had an average of 250 accounts at any given point) is working very well.

    I do not know what degree of difference (or similarity) between ads makes one a duplicate. I do know, though, that a solution that offers duplicate prevention would come in very handy, especially when sites get more popular.

    As you know, my site went live 2-3 days ago and, with barely 100 users, duplicates are already starting to show up. I can only predict it will become a major nuisance in a short time.

    I can only assume that (a module) could first take some of the following variables to analyze the ad:
    username
    geographical area (state, city, zip code)
    category
    subcategory
    Price
    maybe "uncommon words" like a particular name, neighborhood name etc
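
    As a rough illustration of how those variables could narrow things down, the sketch below only looks at earlier ads from the same user, in the same zip code, category and subcategory, and with a similar price. Everything here is hypothetical: the Listing fields are not the 68 Classifieds database schema, the 10% price tolerance is an arbitrary assumption, and the "uncommon words" idea is left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Listing:
    # Hypothetical fields for illustration, not the 68 Classifieds schema.
    username: str
    zip_code: str
    category: str
    subcategory: str
    price: float
    text: str


def blocking_key(listing):
    """Fields that must match exactly before two ads are compared further."""
    return (
        listing.username.lower(),
        listing.zip_code,
        listing.category.lower(),
        listing.subcategory.lower(),
    )


def find_suspects(new_listing, existing, price_tolerance=0.10):
    """Return earlier ads with the same key and a price within the tolerance."""
    key = blocking_key(new_listing)
    suspects = []
    for old in existing:
        if blocking_key(old) != key:
            continue
        limit = price_tolerance * max(new_listing.price, old.price)
        if abs(new_listing.price - old.price) <= limit:
            suspects.append(old)
    return suspects


if __name__ == "__main__":
    old_ads = [
        Listing("bob", "00901", "Vehicles", "Bikes", 100.0, "Red bike"),
        Listing("bob", "00901", "Vehicles", "Bikes", 150.0, "Blue bike"),
        Listing("ann", "00901", "Vehicles", "Bikes", 100.0, "Red bike"),
    ]
    new_ad = Listing("Bob", "00901", "vehicles", "bikes", 105.0, "Red bike!!")
    for suspect in find_suspects(new_ad, old_ads):
        print(suspect.text, suspect.price)
```

    The suspects would still need a text comparison or a look from a moderator; matching on these fields alone would also flag a user who honestly sells two similar items.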

    To tell you the truth, with my lack of programming experience, I can only speculate as to what a programmer/coder can come up with in terms of this. On the other hand, I will do some research (maybe on the CL forums) and testing and let you know about my findings.

    If you can offer me some advice that would direct me toward finding better, more useful info for you to develop something like this (if you are interested), please let me know.
  8. seymourjames All Hands On Deck

    You begin to see the complexity behind this now. The real point is that stopping a true duplicate probably isn't hard (and not very useful), but stopping something that a human reader would say is basically a copy is quite difficult - it is actually called pattern recognition. It's also about the choice of criteria in terms of when you say - hey! that is a copy. Now I don't know what Craigslist uses, and I would be surprised if it is very sophisticated, but on the other hand it might be. They may be into some very sophisticated stuff like Google does to detect spamming websites.
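
    For the easy half of that, a true-duplicate check can be sketched in a few lines: normalise case and whitespace, hash the result together with the username, and refuse anything already seen. This is a sketch under assumptions, not a 68C module. The names fingerprint and DuplicateFilter are made up, and the fingerprints live in memory here rather than in a database.

```python
import hashlib
import re


def fingerprint(username, title, body):
    """Hash of the user plus the ad text with case and spacing normalised."""
    normalized = re.sub(r"\s+", " ", f"{title} {body}".lower()).strip()
    payload = f"{username.lower()}|{normalized}"
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class DuplicateFilter:
    """Remembers fingerprints of accepted ads (in memory, for illustration)."""

    def __init__(self):
        self._seen = set()

    def accept(self, username, title, body):
        """Return True and remember a new ad, or False for an exact repeat."""
        fp = fingerprint(username, title, body)
        if fp in self._seen:
            return False
        self._seen.add(fp)
        return True


if __name__ == "__main__":
    ads = DuplicateFilter()
    print(ads.accept("bob", "Red bike", "Good  condition"))
    print(ads.accept("Bob", "red bike ", "good condition"))
    print(ads.accept("bob", "Red bike 2", "Good condition"))
```

    The third ad gets through because one changed character produces a different hash, which is exactly the weakness described above.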

    The bottom line is that you are forgetting that 68C requires an administrator to monitor ads not just for duplicates but for all forms of abuse - it is not Craigslist (costing large sums of money to run), and even they have many moderators, I am sure. Neither 68C nor Craigslist is a completely "hands-off" system where people just sit back and do nothing. I honestly see no competitive advantage that Craigslist derives from such a feature unless it is very sophisticated and beyond anything you are likely to find here.
