Google Gemini, a multimodal AI by DeepMind, processes text, audio, images, and more. Gemini outperforms in AI benchmarks, is optimized for varied devices, and has been tested for safety and bias, adhering to responsible AI practices.
Explore our extensive collection of AI research papers and machine learning manuscripts (LLMs) from top academics and industry experts. Delve into the latest findings, breakthroughs, and peer-reviewed articles of 2023, providing a deep understanding of the ever-evolving AI landscape.
Carbon is a unified API to connect external data to your vector databases. Build better, personalized AI applications.
Cerelyze - Enabling engineers to rapidly reproduce scientific research
However, due to the unavailability of experts in these locations, the data has to be transferred to an urban healthcare facility (AMD and glaucoma) or a terrestrial station (e.g., SANS) for more precise disease identification.
Eosinophilic Esophagitis (EoE) is a chronic, immune/antigen-mediated esophageal disease, characterized by symptoms related to esophageal dysfunction and histological evidence of eosinophil-dominant inflammation.
Event-based motion deblurring has shown promising results by exploiting low-latency events.
MS3D++ provides a straightforward approach to domain adaptation by generating high-quality pseudo-labels, enabling the adaptation of 3D detectors to a diverse range of lidar types, regardless of their density.
Granger causal inference is a contentious but widespread method used in fields ranging from economics to neuroscience.
Efficient RGB-D semantic segmentation, which plays a vital role in analyzing and recognizing environmental information, has received considerable attention for mobile robots.
For the at-most-one-change-point problem, we propose using a conceptor matrix to learn the characteristic dynamics of a specified training window in a time series.
To balance efficiency and effectiveness, the vast majority of existing methods follow the two-pass approach, in which the first pass samples a fixed number of unobserved items by a simple static distribution and then the second pass selects the final negative items using a more sophisticated negative sampling strategy.
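The two-pass approach described in the entry above can be illustrated with a minimal Python sketch. It is not the method from the paper: the function name, the per-user score dictionary, the uniform first-pass distribution, and the "keep the highest-scored candidates" second-pass rule are all assumptions chosen for illustration.

```python
import random


def two_pass_negative_sampling(user_scores, observed, num_candidates,
                               num_negatives, rng=None):
    """Hypothetical two-pass negative sampler for a single user.

    user_scores: dict mapping item id -> current model score (assumed input).
    observed:    set of item ids the user has already interacted with.
    """
    rng = rng or random.Random()

    # Only items without observed feedback can serve as negatives.
    unobserved = [item for item in user_scores if item not in observed]

    # First pass: draw a fixed-size candidate pool from a simple static
    # distribution (uniform here, as an assumption).
    pool_size = min(num_candidates, len(unobserved))
    candidates = rng.sample(unobserved, pool_size)

    # Second pass: apply a more selective strategy to the small pool.
    # Here the highest-scored ("hardest") candidates are kept.
    candidates.sort(key=lambda item: user_scores[item], reverse=True)
    return candidates[:num_negatives]


if __name__ == "__main__":
    scores = {item: item / 10 for item in range(10)}
    print(two_pass_negative_sampling(scores, {0, 1}, 5, 2, random.Random(0)))
```

The first pass keeps the cost low because only the small candidate pool is ranked in the second pass, not the full set of unobserved items.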
Our codec demonstrates the potential of specialized codecs for machine analysis of point clouds, and provides a basis for extension to more complex tasks and datasets in the future.
Deep neural networks are vulnerable to universal adversarial perturbation (UAP), an instance-agnostic perturbation capable of fooling the target model for most samples.
RFL means that a recommender system can only receive feedback on exposed items from users and update recommender models incrementally based on this feedback.
The massive successes of large language models (LLMs) encourage the emerging exploration of LLM-augmented Autonomous Agents (LAAs).
Furthermore, we identify the aspects of deductive reasoning ability on which deduction corpora can enhance LMs and those on which they cannot.
Denoising Diffusion Models (DDM) are emerging as the cutting-edge technology in the realm of deep generative modeling, challenging the dominance of Generative Adversarial Networks.
Objective and subjective evaluations show that Phoneme Hallucinator outperforms existing VC methods for both intelligibility and speaker similarity.
Cross-lingual open information extraction aims to extract structured information from raw text across multiple languages.
Tracking 3D objects accurately and consistently is crucial for autonomous vehicles, enabling more reliable downstream tasks such as trajectory prediction and motion planning.
We propose a data augmentation strategy, named DFM-X, that leverages knowledge about frequency shortcuts, encoded in Dominant Frequencies Maps computed for image classification models.
In the second stage, an audio-driven talking head generation method is employed to produce compelling videos from the audio generated in the first stage.
We propose a novel framework CipherChat to systematically examine the generalizability of safety alignment to non-natural languages -- ciphers.
To overcome the above issues, we introduce CycleAdapt, which cyclically adapts two networks: a human mesh reconstruction network (HMRNet) and a human motion denoising network (MDNet), given a test video.
Operations such as edge detection, image enhancement, and super-resolution provide the foundations for higher-level image analysis.
In this work, we propose a novel training mechanism termed SegPrompt that uses category information to improve the model's class-agnostic segmentation ability for both known and unknown categories.
Email platforms need to generate personalized rankings of emails that satisfy user preferences, which may vary over time.
Hyper-relational knowledge graph completion (HKGC) aims at inferring unknown triples while considering their qualifiers.
Medical systematic reviews can be very costly and resource intensive.