Awesome work! Curious how you handle paragraphs and niche language like federal regulations.
What are your favorite ways to do sentence and paragraph embeddings, and is there a framework you like that can be tuned to custom data? Do you find fine-tuning your embedding model helpful?
Thanks! The post doesn’t cover fine-tuning the model, which would be absolutely necessary (but was out of scope for the post). Nils Reimers (the author of SBERT) has been on a speaking circuit covering Generative Pseudo Labelling, which handles the vocabulary gap of new domains that a pretrained SBERT model hasn’t seen yet.
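For anyone unfamiliar with how SBERT-style sentence embeddings get compared, here is a toy sketch of the core mechanics (mean pooling plus cosine similarity). The token vectors below are made up for illustration; a real pipeline would get them from a pretrained transformer via something like the sentence-transformers library.

```python
import math

# Toy illustration: pool per-token vectors into one sentence embedding,
# then compare two sentences by cosine similarity. Token vectors are
# invented 2-d numbers, not real model outputs.
def mean_pool(token_vectors):
    n = len(token_vectors)
    dims = len(token_vectors[0])
    return [sum(v[i] for v in token_vectors) / n for i in range(dims)]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

sent_a = [[1.0, 0.0], [0.8, 0.2]]  # pretend token embeddings, sentence A
sent_b = [[0.9, 0.1], [1.0, 0.0]]  # pretend token embeddings, sentence B

score = cosine(mean_pool(sent_a), mean_pool(sent_b))
```

Fine-tuning (GPL included) is essentially about moving domain-specific sentences closer together in this vector space so that the cosine score becomes meaningful for your vocabulary.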
In your review, did you suggest the definition and explanation that they used? In this situation, would an acknowledgment at the end have been enough? In my mind, it seems like you all had a conversation and the authors took up your suggestions as the reviewer.
No, I did not suggest the definition and explanation as content for them to use. I was trying to explain a concept that they had discussed incorrectly multiple times in the paper. It is an advanced concept that might not even appear in graduate-level courses on the subject, so I can understand why they did not understand it fully. That said, I did not give them permission to copy my words. If there are any particular changes I want authors to make, I put them in quotes; this wasn't in quotes. It was an explanation for their own benefit so that they could correct the mistakes in the paper (by rewriting it).
Once I re-read the submission I wanted to reject it immediately, but I realized that I should get a second opinion first. So I contacted the editors, who agreed that it was blatant plagiarism. Hence, they rejected the paper once I recommended rejection in my second review. So this wasn't just a conversation where I made some suggestions and the authors used them. Even the editors thought it was plagiarism once they looked at it.
An acknowledgment would be impossible because the review was single-blind: the reviewers knew the identities of the authors but not the other way around. What the authors should have done was just rephrase where they used the term in the paper. They didn't even need to copy my explanation, to be frank. The paper would have worked fine without the paragraph they copied. If they had just rephrased the relevant parts, no other changes would have been needed and this whole thing could have been avoided.
In the absence of an explicit directive or request from you, given that the authors are from a different culture, how do you expect them to know what was required of them?
I don't mean to be snarky or accusative. Your comment was thoughtful, articulate and detailed, which tells me you are a sophisticated communicator.
It's a fair question. They were foreigners submitting to an American journal, so there is always the possibility for some sort of cultural misunderstanding in addition to any language difficulties. Nonetheless, the journal's submission process provides authors with a page listing ethical standards they have to follow, and it says that plagiarism of any form is not allowed. In fact, this journal's particular set of standards even mentions that authors cannot copy anything obtained during the peer review process without the "explicit permission" of the reviewer. So I just expect them to follow the rules that they were told about when they submitted the paper.
So, I understand how it's plagiarism, but I'm still not following why your suggestion, offered with the goal of helping them get their paper accepted, wouldn't be acceptable to copy/paste. It went to them and only to them, so it's not like it's a substantial piece of work from another team. It seems to be an extreme form of following the letter of the rule rather than the spirit of it. But I'm not an academic, so I don't really understand this sort of lack of discretionary allowance.
I'm fully on board with fighting plagiarism down to that level.
But that said, I've often wondered if this requirement of having to "rewrite in your own words" may do a lot of harm too. It obfuscates the fact that the things people are talking about are actually exactly the same, or makes it fuzzy what the exact differences are.
In a particular academic CS area, I've watched people reproduce essentially identical descriptions of setting and assumptions again and again, but, afraid of plagiarism accusations, they reformulate things over and over, which makes it non-obvious that their setup is the same as that of other authors, or even of their own earlier work.
My understanding is that something like the following happened:
1. Authors submit a paper with expository sections about (e.g.) some materials being flammable and others inflammable.
2. The reviewer tries to explain that they have misunderstood the meaning of the terms, explains the meaning carefully, and maybe suggests the terms they might mean.
3. The authors copy in the explanation and maybe replace incorrect usages with weird tortured phrases.
4. Rejection.
Obviously this description reads a little bit silly and things were probably more nuanced in practice. I think I’m probably also being uncharitable towards the authors in the example.
Acknowledging anonymous reviewers is common in my (erstwhile) field. “An anonymous reviewer suggests the following definition of…” I have to say that it seems odd to me to regard this as plagiarism.
Not 100% sure, but I believe the word "confidential" implies that the review should only have been read by the editor(s) and not passed on to the authors.
A review is the written feedback authors receive from the journal reviewer. The reviewer can recommend that the authors revise and resubmit, based on the review comments. Usually the review is not published with the final piece, which is what was meant by “confidential review”.
Thank you for sharing and releasing usable code! Do you know if this would work for GPU based applications? Tensorflow models that are trained on a GPU, for example?
For GPUs, there are a couple of different things you might want to do.
You can use existing tools within LLVM to automatically generate GPU code from existing code, and this works perfectly fine, even when running Enzyme first to synthesize the derivative.
You can also consider taking an existing GPU kernel and automatically differentiating it. We currently support a limited set of cases for this (certain CUDA instructions, shared memory, etc.) and are working on expanding coverage as well as improving performance. AD of general existing GPU kernels is interesting (and more challenging) since racy reads in your original code become racy writes in the gradient, which need extra care to make sure they don't conflict. To my knowledge, GPU AD on general programs (i.e. not a specific code) really hasn't been done before, so it's a fun research problem to work on (and if someone knows of existing tools for this, please email me at wmoses at mit dot edu).
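A minimal way to see the "racy reads become racy writes" point, sketched here in plain Python rather than CUDA (the kernel and variable names are my own illustration, not Enzyme's API): in the forward pass many threads read the same input in parallel, which is harmless; in the reverse pass each of those threads contributes a partial derivative into the same gradient slot, so on a real GPU the accumulation must be atomic.

```python
# Forward "kernel": thread i computes y[i] = x0 * w[i].
# Every thread reads x0 concurrently -- a race-free read.
x0 = 2.0
w = [1.0, 3.0, 5.0]
y = [x0 * w[i] for i in range(len(w))]

# Reverse pass: dL/dx0 = sum_i w[i] * dy[i].
# Now every "thread" WRITES into the same slot dx0, so on a GPU each
# addition below would have to be an atomicAdd to avoid conflicts.
dy = [1.0, 1.0, 1.0]     # upstream gradient, one entry per thread
dx0 = 0.0
for i in range(len(w)):  # serialized here; concurrent on a GPU
    dx0 += w[i] * dy[i]
```

The serialized Python loop hides the hazard; with three real GPU threads doing `dx0 += ...` simultaneously, non-atomic updates could lose contributions.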
Do you know of any interesting natural isomorphism between the categories you define in your paper and the category of finite-dimensional Hilbert spaces? Curious if you have thought about applications to categorical quantum mechanics.
Natural isomorphisms with FDHilb are very difficult to get here: different flavors of Petri nets present different flavors of FREE monoidal categories. "Free" here means that we get categories satisfying exactly the equations needed to be (symmetric, commutative) monoidal, and nothing more. FDHilb, by contrast, is compact closed, and even more, hypergraph. This means it has a lot more structure beyond monoidality: it has products (that are actually biproducts), cups and caps (because it is compact closed), etc. So there is no way to generate this kind of structure from one of our nets: FDHilb has way more equations than just a monoidal category. What you can get, though, is functors from our categories to FDHilb. This is what "freeness" means. :)
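To make the freeness point concrete, here is a sketch (my own illustrative example, not taken from the paper) of the universal property at work: choosing a Hilbert space for each place and a linear map for each transition extends uniquely to a strict symmetric monoidal functor out of the free category generated by the net.

```latex
% N a Petri net, F(N) the free symmetric monoidal category it generates.
% Pick, for instance,
%   H(p) = \mathbb{C}^2 \text{ for every place } p,
%   H(t) \colon H(p_1) \otimes \cdots \otimes H(p_n)
%              \to H(q_1) \otimes \cdots \otimes H(q_m)
%   any linear map, for each transition t \colon p_1 \cdots p_n \to q_1 \cdots q_m.
% Freeness says this data extends uniquely to a strict symmetric
% monoidal functor
\widehat{H} \colon F(N) \longrightarrow \mathbf{FDHilb},
\qquad \widehat{H}(p) = H(p), \quad \widehat{H}(t) = H(t).
% No such functor can be an equivalence, let alone an isomorphism:
% \mathbf{FDHilb} satisfies extra equations (biproducts, cups/caps)
% that F(N) does not.
```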
In https://arxiv.org/abs/1805.05988 we were able to tweak the definition of Petri net a bit to let it generate free compact closed categories, and I feel this is the best we can do.
The kind of graphical gadget that generates FDHilb (in the sense that the graphical calculus is sound and complete wrt FDHilb) is called the ZX calculus (or one of its equivalent variants, such as ZW). It took roughly 10 years to prove that ZX is complete wrt FDHilb! In any case, a string diagram in the ZX calculus looks like a hypergraph with extra properties and equations. But you lose the dynamic interpretation of tokens moving in the net; there are no tokens in ZX!