More on Patent Citations and the Impossibility Skew

This post adds somewhat to my prior one titled Citing References.


A recent draft article by Professors Jeffrey Kuhn (UNC-Business) and Kenneth Younge (EPFL), along with former USPTO Economist (and now professor) Alan Marco, looks more in depth at patent citations and arrives at a conclusion parallel to mine:

Specifically, we observe a dramatic increase in the number of citations generated per year and relate that change to a small proportion of patents flooding the patent office with an overwhelming number of references.

Read the article: https://ssrn.com/abstract=2714954.

The paper divides patents into four categories according to the quantity of prior art citations. Rather than call them “quartiles,” the authors opt for names that relate to the examiner’s ability to review the corpus of submissions (given their time constraints):

  • 0-20 References – Routine
  • 21-100 References – Difficult
  • 101-250 References – Extreme
  • 251+ References – Impossible
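As a rough illustration of how these bins work, the sketch below assigns a patent to a category from its reference count and then computes each category's share of all cited references. The function names and the sample counts are hypothetical and are not taken from the Kuhn, Younge and Marco paper; only the four thresholds come from the list above.

```python
from collections import Counter

# Inclusive upper bounds for the first three categories, as listed in the post.
CATEGORY_UPPER_BOUNDS = [
    (20, "Routine"),
    (100, "Difficult"),
    (250, "Extreme"),
]


def citation_category(num_references: int) -> str:
    """Return the category name for a patent citing num_references references."""
    if num_references < 0:
        raise ValueError("reference count cannot be negative")
    for upper_bound, name in CATEGORY_UPPER_BOUNDS:
        if num_references <= upper_bound:
            return name
    return "Impossible"  # 251 or more references


def citation_share_by_category(reference_counts):
    """Fraction of all cited references contributed by each category."""
    totals = Counter()
    for count in reference_counts:
        totals[citation_category(count)] += count
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {name: total / grand_total for name, total in totals.items()}


if __name__ == "__main__":
    # Made-up reference counts for six patents, for illustration only.
    sample_counts = [5, 12, 18, 40, 150, 600]
    for name, share in citation_share_by_category(sample_counts).items():
        print(f"{name}: {share:.1%}")
```

Even in this toy sample, a single patent in the top bin dominates the total, which is the kind of skew the authors describe.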

Matching my parallel work, the authors write: “By 2014, patents that cite an ‘extreme’ or ‘impossible’ number of citations are responsible for more than 46% of all patent citations, even though they comprise less than 5% of all patents issued in that year.” In my data – looking at patents issued Jan-Aug 2017 – I found that the skew has increased so that the “impossible” patents now cite half of all applicant-submitted patent citations. [Note here that Kuhn looks at citations of US patent documents only, while my approach considered all citations but limited them to those designated as applicant-cited.]

Figure 1 from the article provides further information.

[Figure 1 from Kuhn et al.: KuhnCitations]

When focusing on those “impossible” patents, the team concluded that “the citations generated by highly citing patents appear to be particularly uninformative.” This conclusion is based upon their finding that many of the cited references are dissimilar to the claimed invention.

Kuhn, et al., is directed at academic researchers who use patent citations in their empirical research on business and technological trends. Thus, the conclusions drawn are not designed to either indict or improve the patent system – rather, the primary conclusion is that researchers should consider this skew when making assumptions about the importance of patent citations. “The main contribution of this article is to provide prima facie evidence that the citation generating process is now generating significant measurement error for many academic studies.”

48 thoughts on “More on Patent Citations and the Impossibility Skew”

  1. Does the number of references cited in a particular patent include NPL? Frequently I see them used abusively, where a practitioner will cite every office action received for every case they’ve written regarding some aspect of a particular technology for a client.

    1. Not sure why you label that as an abuse.

      If indeed the aspect of the particular technology is on point, then a colorable argument exists and such is not abuse, eh?

  2. Assuming for the moment that Examiners have access to modern technology, searching through all prior art, or prior art in specific classifications, generally requires some keywords whose number and nature lean on the narrow side to keep irrelevant documents out of the results, and so catch only a small percentage.

    Assuming also that the applicant-cited references are bona fide relevant in the applicant’s view, then they would represent a smaller pool in which to search, affording the opportunity to use keywords whose number and nature are relatively broad and catch a larger percentage.

    In essence, providing a large list is still helpful: it provides a smaller pool of more relevant documents for an Examiner to play with.

  3. At least some of the patents issued that cite a large number of references are patents that the patentee finds particularly valuable and is likely to enforce. Patents that are enforced are the ones where the prosecution history is closely scrutinized and attacked. One common basis for attack is to allege that the patentee failed to cite to the patent office some reference. Citing more references reduces the number of uncited references that could be used in such an attack. In these cases, the numerous references cited are meant to reduce the risk of losing the patent.

  4. So if I understand figure one from the article, in 1980 there was less than 100,000 TOTAL back references for ALL patents, and the entirety of those references were of the “routine” category.

    Then ten years later, in 1990, there was still less than 400,000 TOTAL back references for ALL patents, and practically the entirety of those references were still of the “routine” category.

    Another decade later, in 2000, the TOTAL back references for ALL patents stood at about 1,500,000, but now the new category of “difficult” appears to take 40% of that growth, while the “routine” category still doubles.

    Patent grants for 1980: 112,379
    Patent grants for 1990: 176,264 (1.56 factor of 1980)
    Patent grants for 2000: 315,015 (1.78 factor of 1990, 2.80 factor of 1980)

    So that’s approximately one cited reference per granted patent in 1980….

    The thrust of the article here is that the viability of using backward cites has degraded over time.

    Maybe that’s not the right question.

    Maybe the right question is why is that measure even being considered as any type of value indicator?

    The indicator seems untrustworthy (and yes, likely confounded with any number of uncontrolled variables).

    Further, the numbers don’t add up. From 1990 to 2000, the factor growth was 1.78 (less than double) and yet BOTH the “routine” category doubles AND the new “difficult” category mushrooms out of nowhere to appear to be ALL of the 1990 level. It appears that the number of actual patents less than doubles but the reference base for back patent numbers more than triples.

    Simply on a “per patent” basis, the graph is not consistent with reality – something is wrong with the model for that graph.
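The arithmetic in the comment above can be reproduced with a short sketch. The grant counts are the ones quoted in the comment; the back-reference totals are only the commenter's rough readings of Figure 1 (upper bounds for 1980 and 1990, an eyeballed value for 2000), not data from the paper, and the function names are hypothetical.

```python
# Patent grant counts as quoted in the comment.
grants = {1980: 112_379, 1990: 176_264, 2000: 315_015}

# Approximate totals the commenter read off Figure 1 (assumed values,
# not figures taken from the paper itself).
approx_back_references = {1980: 100_000, 1990: 400_000, 2000: 1_500_000}


def growth_factor(series, start_year, end_year):
    """Ratio of the end-year value to the start-year value."""
    return series[end_year] / series[start_year]


def references_per_patent(year):
    """Approximate back references per granted patent in a given year."""
    return approx_back_references[year] / grants[year]


if __name__ == "__main__":
    for start, end in [(1980, 1990), (1990, 2000), (1980, 2000)]:
        print(f"grants {start}->{end}: x{growth_factor(grants, start, end):.2f}")
    print(
        "references 1990->2000: "
        f"x{growth_factor(approx_back_references, 1990, 2000):.2f}"
    )
    for year in sorted(grants):
        print(f"{year}: ~{references_per_patent(year):.2f} references per patent")
```

Under these assumed readings, grants grow by a factor of roughly 1.8 from 1990 to 2000 while back references grow by a factor of 3.75, so the per-patent figure rises from about 2.3 to about 4.8. The comment's 1.56 and 1.78 factors appear to be truncated rather than rounded values of the same ratios.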

  5. “…even though they comprise less than 5% of all patents issued in that year.” CONSTITUTE, not “comprise”. Sheesh, you’d think patent practitioners of all people would know the difference.

    Oh wait, none of the authors of the article is a patent practitioner.

  6. Meh I review several thousand refs per patent generally nowadays anyway. I don’t generally become agitated as long as the digital ocr thing for the patent refs will work to import them to EAST. Though I’ve only ever had one case that cited 750 refs. and maybe 2 others that did 250 or so.

    Still, there probably should be rulemaking put into place to make that nonsense get real. Typically in cases where there are a lot of cites like half are completely irrelevant to the topic of the application and the other half are just showing whatever basic device the applicant has made without whatever is in their “wherein” clause at the end of the claim (the jepson clause in disguise lol). So they’re citing like half irrelevant and the other half that all just show basically the same thing. Then maybe 1 or 2ish relevant references that either show the wherein clause, show something close to it, or show it attached to a slightly different device than the one the applicant is claiming. Basically like 20 refs probably should have been submitted, and the rest is just wasted time.

    1. Perhaps you could hazard a guess, 6. Is there any correlation between the strength or weakness of the “wherein” feature and the total number of references? As we all know, the best place to hide a leaf is in the middle of the Amazonian rain forest.

      Perhaps you could tell us, 6, how much longer it takes you, to get through 150 references to find the handful that are relevant to patentability, than to get through a 30 reference list. Perhaps 5 times as long?

      Lastly, with the PCT there is an X,Y,A scheme of labelling references. X for novelty, Y for obviousness and A for mere technological background. Why doesn’t the USPTO oblige applicants doing their duty of candor to label each of their references X, Y or A?

      1. That’s perhaps a bit of a misnomer there MaxDrei.

        The question really is NOT between “30” and “150,” as the examiner is responsible for the entire world of prior art.

        When viewed in the context of what is actually asked of examiners, even a list of “1,000” is merely a small, infinitesimal even, list of prior art that exists.

        Implying that a needle is being hidden in a haystack then is NOT helpful, and in fact is unhelpfully divisive.

        Those with an “anti-patent” lean can easily be seen to come out on a particular side of the present topic (sort of like the canary in the coal mine).

        1. “as the examiner is responsible for the entire world of prior art.”

          Only in the abstract, the technical responsibility per agreement is only for the art in the search. The office is in effect responsible for doing a search and then responsible for looking over that search. There are 0 assurances as to how good that search is.

              1. We’ve had this conversation before 6. What is required of you as an examiner is to examine properly under the law.

                That means a WHOLE lot more than what you are trying to say here.

                1. (in other words, your “but that’s what my SPE told me” line of thought is not a valid legal position – that line of thought went out way back in the Vietnam days of “but I was only following orders”)

        2. “The examiner is responsible for the entire world of prior art.”

          Well, in that case, it will take an infinite amount of time before you get your next OA. Let your clients know how you are serving their best interests.

          1. “Let your clients know how you are serving their best interests.”

            missed this the other day…

            Your comment here is not only a non sequitur, it shows that you are STILL trying to push off onto others YOUR responsibilities.

            How in the world is the issue of YOU doing your job a reflection of how I might be serving my client’s best interests?

            Are my client’s best interests really served by being lax on the “Do it right the first time through” idea so that IF their granted patent is deemed worthy of being infringed, it can be easily ripped from them in a post-grant review?

            Regardless of that – the focus is on you and you doing your job (as opposed to you merely meeting the metrics of how you are doing your job).

            Please do not try to shuck that onto someone else.

        3. Also, the Examiner is not the only person responsible for making sure a good patent issues. The applicant is also responsible. I can’t just file crap for my taxes and expect the IRS to fix everything for me without any input from me. Or, if they do, it will certainly not be to my liking. Why should patent applicants expect that, with no help or input from them, they will somehow get the best quality patents? The best applications I’ve examined included input from the inventors/attorneys throughout the process in the form of interviews and actually working together to identify and best claim the invention.

          1. sure a good patent issues. The applicant is also responsible.

            See Tafas – and we are Specifically talking about those things that ARE your responsibility.

            Your analogy with the IRS – much like your other attempts – is simply logically flawed and not appropriate.

            Maybe instead of dodging the point directly presented, you recognize the validity of that point?

            1. I will admit the validity of your point regarding what “examination” entails, and admit that what is “required” of Examiners is impossible. Much like what is “required” of police is impossible, or what is “required” of the IRS is impossible. But like police or the IRS, Examiners are doing the best we can with what we have.

              1. Much like what is “required” of police is impossible, or what is “required” of the IRS is impossible.

                LOL – NO – your fallacy comparisons remain fallacies.

                This understanding seems to evade you. Your attempted comparisons simply are not on point.

                1. Nice non sequitur 6.

                  What part exactly do you think is in a “fantasy universe?”

                  Be specific, please.

                  Since I know that you will not be able to show any single thing that I have stated here to be false, your own comment belongs in a fantasy universe.

      2. “Perhaps you could hazard a guess, 6. Is there any correlation between the strength or weakness of the “wherein” feature and the total number of references?”

        Well if they have 800 references cited but only a few dance around the wherein and you don’t find it in the rest of your search then I’d hazard it’s pretty strong.

        But no, not at the outset of the search.

        “Perhaps you could tell us, 6, how much longer it takes you, to get through 150 references to find the handful that are relevant to patentability, than to get through a 30 reference list. Perhaps 5 times as long?”

        Depends on the limitations I’m looking for. If I’m looking for some material something is made out of that isn’t ez word searched or some tiny feature that I have to look through 30+ drawings in every ref then it takes much longer to look for something. If it’s a grand overall design that I’m looking for then I can do a 150 refs in not that long a time.

        “doesn’t the USPTO oblige applicants doing their duty of candor to label each of their references X, Y or A?”

        Recently I do similar anyway in the conclusion because my current boss is all upset about extra citations for some odd reason.

  7. Don’t see what the point is in giving labels like extreme and impossible. The old timers flipped through hundreds of patents in the shoes (yes, by hand!!!), and could determine their relevance. I worry about the current crop of “word search” Examiners . . .

    1. “hundreds of patents ”

      Um we’re doing thousands now, in the same amount of time. And I’ve seen the old subclasses, it wasn’t always even hundreds.

    2. With the new CPC classification system, many of the applications I review, when I do a classification search, return, wait for it, OVER 30,000 REFERENCES. The classes are getting huge (sometimes the number doubles when I update my search 6 months later), and each application is being put in 5-10 classes. I haven’t done a “complete” classification search . . . .well, ever. The classification search is just a good start to help narrow down my text search.

      1. To help put that number in perspective, if I looked at each of those references (30,000) from the classifications that my application was given, and I spent 1 second on each, that would be OVER 8 HOURS STRAIGHT of looking at references. For 1 second each. I think that “extreme” and “impossible” are very apt descriptors based on the authors’ framework of “given their time constraints.”
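The estimate in the comment above is a one-line calculation, sketched here so it can be rerun with other inputs. The function name is hypothetical, and any per-reference time other than the 1 second used in the comment is an assumption for illustration.

```python
def review_hours(num_references: int, seconds_per_reference: float) -> float:
    """Total review time in hours at a fixed number of seconds per reference."""
    if num_references < 0 or seconds_per_reference < 0:
        raise ValueError("inputs must be non-negative")
    return num_references * seconds_per_reference / 3600  # 3600 seconds per hour


if __name__ == "__main__":
    # The comment's case: 30,000 references at 1 second each.
    print(f"{review_hours(30_000, 1):.2f} hours")
    # The post's category thresholds at an assumed 60 seconds per reference.
    for count in (20, 100, 250):
        print(f"{count} references: {review_hours(count, 60):.2f} hours")
```

At 1 second each, 30,000 references come to 30,000 / 3,600, or about 8.3 hours, which matches the "over 8 hours" figure.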

        1. I am curious exer, do you fully apply the law as delivered (some would say rewritten) by the Supreme Court and examine claims for obviousness based on the KSR directives (that is, beyond the classification system and to the “jigsaw puzzle” effect)?

          Is not your job (as opposed to any metric of your job – and I do hope that you understand the difference) to examine “under the law,” or more specifically, under the rule of law (37 CFR 1.104)…?

          If the Supreme Court sets the standard for passing 103 in such a way that to pass, the whole jigsaw puzzle of multiple art units must be examined, why are you not doing so?

          And yes, that multiplier effect makes the “1,000” impossible categorization pale in comparison.

          1. I am examining the multiple areas of art. I use my best judgment to apply several filters (usually relevant text) to narrow the multiple areas of art down to the most relevant references. So yes, I “search” all those 30,000+ references, but I don’t read most of them. I think this is how most examiners do the work. If you want a more thorough examination, for us to read all 30,000 references each time, then either (1) you need to be willing to pay a lot more for examination, or (2) expect the better examiners to leave as the workplace and demands get more and more unreasonable, and you’ll be left with the worst examiners.

            1. Do not confuse the metrics of your job with the actual job.

              I want what the Office has set forth as what I am paying for. And yes, that DOES resolve down to you and why you were hired. How you are measured to do WHAT you are to do is just not the same as WHAT you are to do. That is really not that difficult of a concept.

              And that, by the way, is part of why I provided you with 37 CFR 1.104.

              As to the distinction between your job and the job of the director, meh – you attempt a difference without a distinction. YOU ARE HIRED TO EXAMINE – and that clearly means that 37 CFR 1.104 applies to you.

              If you have a problem with examination, it is not the examiners’ fault,

              Absolutely false if YOU are not doing the examination that is required under the law. Sorry pal, but being an examiner means certain things – things that you do not get to pass the buck on.

              I “get” that you want to “pass the buck,” but too bad for you that you are on the bottom and you cannot pass it on and certainly, you cannot pass it on to the applicants.

              Now then, perhaps you want to return to my post and actually comment on 37 CFR 1.104 and what that means when the Supreme Court makes YOUR job that much more intensive….

              1. You could make the same argument for the police, and it doesn’t do anybody good there either. Law enforcement officers are hired to investigate crimes, so instead of investigating murders should they be chasing down every stolen TV and driver with a suspended license?

                You can’t have your cake and eat it too. If you want applications examined thoroughly as called for in 37 CFR 1.104 then be prepared to wait 10 yrs for a first action.

                Likewise, if you have a problem with the entirety of any system, you can’t blame any actor in particular. Doctors aren’t the problem with healthcare, teachers aren’t the problem with education, and examiners aren’t the problem with examination.

                What you’re suggesting is that any particular examiner take a stand against the system and do their job “right” (under the law), without regard for their advancement or continued employment. I assume that most examiners aren’t willing to stake their career on that, and expecting them to is nearsighted at best.

                1. Your argument is a fallacy, as the police are not engaged on a voluntary basis and have not accepted a fee for a service (as advertised).

                  You can’t have your cake and eat it too.

                  NO ONE is asking for that.

                  On the other hand, I sure as H better get what I pay for – under the law and under the rule of law that examiners are bound to.

                  No one said anything about waiting or not – why introduce a strawman like that?

                  Likewise, if you have a problem with the entirety of any system, you can’t blame any actor in particular.

                  This is a non sequitur. The problem under discussion is NOT some “entirety of any system” and in fact, the point presented has been carefully chosen to be very particular – and particular to the actor identified.

                  and do their job “right” (under the law), without regard for their advancement or continued employment.

                  Your metrics – and whether or not you advance or even have continued employment are YOUR problem. You have zero right to attempt to make them my problem.

                  This is simply NOT a matter of being “nearsighted.” You do NOT get to slough off at my (and my client’s) expense.

                  To do so – and then on top of it to pretend that it is somehow the “right thing to do” is immoral.

                  If in fact, no one sticks around to do the actual job, that THEN would be a huge benefit by forcing the system to focus on patent examination quality in the first instance.

              2. I pay for the police through my taxes. I sure as H better get every single criminal act stopped and investigated and referred for prosecution. That’s what I pay for.

                Hyperbole much?

                1. Oh, you pointed out it was a fallacy, so any of my arguments are just dismissed? Did you not notice that I am providing arguments responsive to yours? I think you would make just the kind of examiner you seem to think we all are. Lazy, disingenuous, and lacking sound logic.

                2. An argument based on a known fallacy is dismissed.

                  Yes, if ALL of your arguments are so based, then ALL of them will be dismissed.

                  And no, arguments based on fallacy are – by definition – NOT responsive.

                  The lesson to be learned is to NOT base your arguments on fallacy. Instead, you appear to want to use the Malcolm Accuse Others name-calling meme (it does not work for him, why in the world would you think that it would work for you?)

                  (this is not rocket science)

          2. I also can’t get over how you want to define my job. IT IS NOT MY JOB. IT IS THE JOB OF THE DIRECTOR/COMMISSIONER that you keep pointing out. The director/commissioner has delegated certain responsibilities to examiners. I do what I was hired to do. If you have a problem with examination, it is not the examiners’ fault, it is the director/commissioner that you have a problem with, and how he/she is delegating work.

        2. “that would be OVER 8 HOURS STRAIGHT of looking at references. ”

          Which is why there needs to be a time adjustment if the office still wants complete searches. But instead they seem to want to fall back on word searches, but of course they get resistance from SPE/primaries because they know that sht is incomplete and a lot of SPEs etc. really want the search done correctly.

          1. Note that the 8 hours is for 1 second per reference. Even at high speed, I need at least a couple seconds to flip the first few figures to see if it looks relevant. And that’s not an option when I’m looking more for a concept than a physical structure.

            If I can narrow down a search using classification and text to fewer than 200 references, I can give the time needed to those references to get the most relevant 3 or 4, and then actually read those 3 or 4. If there are still features missing between the prior art and the claims, I can do a very focused search on those features, including IEEE, google, and other databases. To me, that is a much better use of 8 hours of searching, rather than staring blankly at 30,000 patent cover pages.

            1. “Note that the 8 hours is for 1 second per reference”

              I know already I’ve done the math many times. I routinely do 3k+ refs so I’m familiar with how fast the search has to be going or else I’m “in the hole” on time.

            2. ” To me, that is a much better use of 8 hours of searching, rather than staring blankly at 30,000 patent cover pages.”

              As I’m sure you know, it depends on the case.

            3. “I can give the time needed to those references to get the most relevant 3 or 4”

              Yeah probably the hardest thing about the explosion of art isn’t even having to do another 2k refs, it’s that the extra refs now have 20 “most relevant refs” that then need to be gone through with a fine tooth comb, each of them have their own merits and a decision needs to be made as to which is best for each claim. Sometimes I’ve started to where I just start rejecting claims with one ref I know will get some, and then move to the next ref I know will get some that I think is 2nd best on perusal and go from there because otherwise it’d take all day just to figure out which of the 20 is “the best”.

  8. Timewise I can look at roughly 5 Patent references for every one NPL. The office doesn’t OCR the NPL or even put the title in the system. They are cumbersome to deal with logistically and a pain to read through. A good study would separate the analysis to account for both. I’m sure the office now has data as part of the time analysis.
