SymTax is a novel AI for citation recommendation. It mimics human behavior by using a "symbiotic" model and hyperbolic geometry to improve accuracy.

How Symbiotic AI Can Find Your Paper's Next Great Citation

2025/08/26 16:44

:::info Authors:

(1) Karan Goyal, IIIT Delhi, India ([email protected]);

(2) Mayank Goel, NSUT Delhi, India ([email protected]);

(3) Vikram Goyal, IIIT Delhi, India ([email protected]);

(4) Mukesh Mohania, IIIT Delhi, India ([email protected]).

:::

Abstract and 1. Introduction

  2. Related Work

  3. Proposed Dataset

  4. SymTax Model

    4.1 Prefetcher

    4.2 Enricher

    4.3 Reranker

  5. Experiments and Results

  6. Analysis

    6.1 Ablation Study

    6.2 Quantitative Analysis and 6.3 Qualitative Analysis

  7. Conclusion

  8. Limitations

  9. Ethics Statement and References

Appendix

Abstract

Citing pertinent literature is pivotal to writing and reviewing a scientific document. Existing techniques mainly focus on either the local context or the global context for recommending citations but fail to consider actual human citation behaviour. We propose SymTax[1], a three-stage recommendation architecture that considers both the local and the global context, as well as the taxonomical representations of query-candidate tuples and the Symbiosis prevailing amongst them. SymTax learns to embed the infused taxonomies in hyperbolic space and uses hyperbolic separation as a latent feature to compute query-candidate similarity. We build a novel and large dataset, ArSyTa, containing 8.27 million citation contexts, and describe the creation process in detail. We conduct extensive experiments and ablation studies to demonstrate the effectiveness and design choice of each module in our framework. Also, combinatorial analysis from our experiments sheds light on the choice of language models (LMs) and fusion embedding, and on the inclusion of section headings as a signal. Our proposed module that captures the symbiotic relationship alone leads to performance gains of 26.66% and 39.25% in Recall@5 w.r.t. SOTA on the ACL-200 and RefSeer datasets, respectively. The complete framework yields a gain of 22.56% in Recall@5 w.r.t. SOTA on our proposed dataset. The code and dataset are available at https://github.com/goyalkaraniit/SymTax.
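For readers unfamiliar with the metric quoted above, the following is a minimal sketch of Recall@K under the common definition (the fraction of a query's ground-truth citations that appear among the top K recommendations, averaged over queries). The function names and the toy paper IDs are hypothetical, and the paper's own evaluation code may differ in detail.

```python
from typing import List, Sequence, Set, Tuple


def recall_at_k(ranked: Sequence[str], relevant: Set[str], k: int) -> float:
    """Fraction of the relevant papers that appear in the top-k of a ranking."""
    if not relevant:
        raise ValueError("relevant must contain at least one paper id")
    top_k = set(ranked[:k])
    return len(top_k & relevant) / len(relevant)


def mean_recall_at_k(queries: List[Tuple[Sequence[str], Set[str]]], k: int) -> float:
    """Average Recall@k over (ranking, ground-truth set) pairs."""
    return sum(recall_at_k(r, rel, k) for r, rel in queries) / len(queries)


# Toy, made-up data: each query has a single ground-truth citation "p1".
queries = [
    (["p3", "p7", "p1", "p9", "p2", "p4"], {"p1"}),  # found at rank 3
    (["p5", "p6", "p8", "p0", "p2", "p1"], {"p1"}),  # found at rank 6, outside top 5
]
print(mean_recall_at_k(queries, k=5))  # 0.5
```

In local citation recommendation each context typically has one ground-truth paper, so Recall@5 reduces to the share of contexts whose cited paper is ranked in the top five.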


1 Introduction

Citing has always been the backbone of scientific research. It enables trust and supports the claims made in a scientific document. The ever-growing volume of scientific literature makes it imperative to ease the author's task of finding a list of suitable papers to follow and cite (Johnson et al., 2018; Bornmann et al., 2021; Nane et al., 2023). Citation recommendation is a process that helps researchers stay aware of the relevant research in their respective domains. There are two different approaches to recommending citations: local (Dai et al., 2019; Ebesu and Fang, 2017; Huang et al., 2012; He et al., 2010) and global (Xie et al., 2021; Ali et al., 2021; Bhagavatula et al., 2018; Guo et al., 2017). Local citation recommendation is the task of finding and recommending the most relevant prior work for a specific text passage (also known as the citation context), which makes it context-aware. Global citation recommendation, on the other hand, recommends a list of suitable prior art for the entire document, usually given the title and abstract or the whole document. In this paper, we address the task of local citation recommendation, which is more fine-grained and provides a solution to the actual challenge the author faces.

Figure 1: The proposed method consists of three essential modules. Prefetcher and Reranker take as input a query consisting of the citation context, title, abstract and taxonomy of the citing paper. For each candidate paper (Ci), Enricher uses knowledge from the citation network, and Reranker generates the final top-K recommendations.
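The data flow in Figure 1 can be illustrated with a toy sketch. Everything here is an assumption made for illustration: the function names, the Jaccard word-overlap score, and the made-up corpus are placeholders, not the models or data the paper uses. The sketch only shows how a prefetched shortlist can be widened through the citation network before a final rerank.

```python
from typing import Dict, List, Set


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Placeholder similarity: word overlap between two term sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def prefetch(query: Set[str], corpus: Dict[str, Set[str]], k: int) -> List[str]:
    """Stage 1 (toy): keep the k papers most similar to the query."""
    ranked = sorted(corpus, key=lambda pid: (-jaccard(query, corpus[pid]), pid))
    return ranked[:k]


def enrich(candidates: List[str], links: Dict[str, List[str]]) -> List[str]:
    """Stage 2 (toy): add papers linked to each candidate in the citation network."""
    pool, seen = list(candidates), set(candidates)
    for pid in candidates:
        for neighbour in links.get(pid, []):
            if neighbour not in seen:
                seen.add(neighbour)
                pool.append(neighbour)
    return pool


def rerank(query: Set[str], pool: List[str], corpus: Dict[str, Set[str]], top_k: int) -> List[str]:
    """Stage 3 (toy): rescore the enriched pool and return the top_k papers."""
    ranked = sorted(pool, key=lambda pid: (-jaccard(query, corpus[pid]), pid))
    return ranked[:top_k]


# Made-up corpus: paper id -> terms; made-up citation links between papers.
corpus = {
    "A": {"adversarial", "classification", "consequences"},
    "B": {"autonomous", "driving", "attacks", "defenses"},
    "C": {"citation", "recommendation"},
    "D": {"adversarial", "cars", "robustness"},
}
links = {"A": ["B"], "D": ["A"]}
query = {"adversarial", "consequences", "autonomous", "cars"}

shortlist = prefetch(query, corpus, k=2)       # "B" is missed here
pool = enrich(shortlist, links)                # "B" enters via its link to "A"
print(rerank(query, pool, corpus, top_k=3))
```

In this toy run, paper "B" shares too few terms with the query to be prefetched, but it reaches the reranker because a prefetched paper is linked to it.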

For example, consider the citation excerpt below:[2]

This can have extreme consequences in real-life scenarios such as autonomous cars CitX.

Examining the above context in isolation makes it challenging to predict the specific article cited at CitX. However, leveraging global information such as the title, abstract, and taxonomy narrows down the search space, while utilizing the symbiotic relationship provides the model with an enriched pool of the most suitable candidates. Unlike the ACL-200 and RefSeer datasets, which have curated contexts of fixed size, we curate richer contexts by incorporating the complete adjoining sentences around the citation sentence. To summarise, we make the following contributions:

• Dataset: We have constructed a dataset, ArSyTa, comprising 8.27 million comprehensive citation contexts across diverse domains, featuring richer density and relevant features, including taxonomy concepts, to facilitate the task of citation recommendation.

• Conceptual: We explore the concept of Symbiosis from Biology, draw its analogy with human citation behaviour in the scientific research ecosystem, and use it to select a better pool of candidates.

• Methodological: We propose a novel taxonomy-fused reranker that learns projections of fused taxonomies in hyperbolic space and utilises hyperbolic separation as a latent feature.

• Empirical: We perform extensive experiments, ablations, and analysis on five datasets and six metrics, demonstrating that SymTax consistently outperforms SOTA by large margins.
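To make "hyperbolic separation" concrete, here is a minimal sketch of a distance in hyperbolic space. It assumes the Poincaré ball model, which this excerpt does not specify, and the function name and sample points are hypothetical. In that model the distance between points u and v inside the unit ball is arcosh(1 + 2‖u − v‖² / ((1 − ‖u‖²)(1 − ‖v‖²))).

```python
import math
from typing import Sequence


def poincare_distance(u: Sequence[float], v: Sequence[float]) -> float:
    """Geodesic distance between two points of the open unit (Poincare) ball."""
    if len(u) != len(v):
        raise ValueError("u and v must have the same dimension")
    sq_u = sum(x * x for x in u)
    sq_v = sum(x * x for x in v)
    if sq_u >= 1.0 or sq_v >= 1.0:
        raise ValueError("points must lie strictly inside the unit ball")
    sq_diff = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.acosh(1.0 + 2.0 * sq_diff / ((1.0 - sq_u) * (1.0 - sq_v)))


# From the origin to radius 0.5 the distance equals ln 3 (about 1.0986).
print(poincare_distance((0.0, 0.0), (0.5, 0.0)))

# The same Euclidean gap (0.1) is far larger near the boundary than near the origin.
near_origin = poincare_distance((0.0, 0.0), (0.1, 0.0))
near_boundary = poincare_distance((0.85, 0.0), (0.95, 0.0))
print(near_origin < near_boundary)  # True
```

Distances grow rapidly toward the boundary of the ball, which is why hyperbolic embeddings are often used for tree-like structures such as taxonomies. A scalar like this could be passed to a reranker as one feature among others; how SymTax actually does so is described in the paper's Reranker section, not here.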


:::info This paper is available on arXiv under the CC BY-SA 4.0 Deed (Attribution-ShareAlike 4.0 International) license.

:::

[1] Accepted at ACL 2024.

[2] The excerpt is borrowed from "Towards Consistency in Adversarial Classification" (Meunier et al., 2022). The cited article is "An Analysis of Adversarial Attacks and Defenses on Autonomous Driving Models" (Deng et al., 2020).
