Communities in IITJ TR


Recent Submissions

Item
Role of Scene text in Image Semantics
(Indian Institute of Technology Jodhpur, 2022-08) Harit, Gaurav
Since the advent of the printing press, text has slowly made inroads into the world we have built for ourselves. The symbolic nature of text allows it to explain ideas more succinctly. Thus scene text often occurs naturally in images (e.g., street or storefront images). Further, text is also embedded into images to drive home clear takeaway points (e.g., printed posters, advertisement images). In both cases, though, it brings in crucial contextual information that aids in interpreting such images. However, despite this pervasion of scene text in our everyday images [Dey et al., 2021] and the rich information source it entails, early works in visual understanding tasks like Image Classification, Captioning, and Visual Question Answering (VQA) [Antol et al., 2015] did not leverage the scene text content of images. This can be attributed to the challenges of detecting and recognizing scene text in the wild. However, maturing research in scene text recognition has improved the ability to read text in natural images, thus making the scene text content more accessible. This easy accessibility of scene text content, coupled with the recent advances in multimodal architectures [Hu et al., 2020], provides a unique opportunity to incorporate scene text into visual understanding tasks. As our first point of investigation [Dey et al., 2021], we propose to jointly use scene text and visual channels for robust semantic interpretation of images. We not only extract and encode visual and scene text cues but also model their interplay to generate a contextual encoding with rich semantics. The contextual encoding thus generated is applied to retrieval and classification tasks on multimedia images with scene text content, to demonstrate its effectiveness. In the retrieval framework, we augment the contextual semantic representation with scene text cues to mitigate vocabulary misses that may have occurred during the semantic embedding.
To deal with irrelevant or erroneous scene text recognition, we apply query-based attention to the text channel. We show that our multi-channel approach, involving contextual semantics and scene text, improves upon the absolute accuracy of the current state-of-the-art methods on the Advertisement Images Dataset by 8.9% in the relevant statement retrieval task and by 5% in the topic classification task. Our results confirm our initial hypothesis that scene text plays an essential role in the semantic understanding of images. These results encourage us to extend our framework to more challenging tasks, like Text-VQA [Singh et al., 2019a], that explicitly require us to read and reason with the scene text of an image. However, the scene text words come from a long-tailed distribution, giving such tasks zero-shot characteristics. We hypothesize that the zero-shot nature of these tasks can benefit from leveraging external knowledge corresponding to the scene text. The open-ended question answering task of Text-VQA often requires reading and reasoning about rarely seen or completely unseen scene text content of an image. We address this zero-shot nature of the task by proposing the generalized use of external knowledge to augment our understanding of the scene text. We design a framework [Dey et al., 2022] to extract, validate, and reason with knowledge using a standard multimodal transformer for vision language understanding tasks. Through empirical evidence and qualitative results, we demonstrate how external knowledge can highlight instance-only cues and thus help deal with training data bias, improve answer entity type correctness, and detect multi-word named entities. We generate results comparable to the state-of-the-art on three publicly available datasets under the constraints of similar upstream Optical Character Recognition (OCR) systems and training data.
Through our experiments, we observe that this external knowledge not only provides invaluable information about unseen scene text elements but also augments the understanding of the text in general with detailed verbose descriptions. Our knowledge-enabled model is robust to novel text, predicts answers with improved entity type correctness, and can even recognize multi-word entities. However, the knowledge pipeline is susceptible to erroneous OCR tokens, which can lead to false positives or complete misses. This also explains how our performance on the datasets is correlated with the particular OCR systems used. Our investigation highlights the challenges and benefits of incorporating scene text into image understanding tasks. We validate our various hypotheses through empirical evidence across five different publicly available standard datasets. We conclude with a discussion on the implicit bias in these datasets for scene text, and propose data augmentation and a novel training scheme to deal with it.
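The query-based attention over the text channel mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not specify the thesis's actual formulation, so the scaled dot-product scoring, the function name `query_attention`, and the toy vectors below are all assumptions chosen for illustration only. The idea shown is that OCR tokens whose embeddings align with the query receive larger weights, so irrelevant or misrecognized tokens contribute less to the pooled text representation.

```python
import math

def query_attention(query, token_vectors):
    """Pool token vectors with softmax weights from scaled dot products.

    Hypothetical helper: the thesis's real attention mechanism is not
    described in the abstract; this is plain scaled dot-product attention.
    """
    scale = math.sqrt(len(query))
    # Relevance score of each scene text token with respect to the query.
    scores = [sum(q * t for q, t in zip(query, vec)) / scale
              for vec in token_vectors]
    # Numerically stable softmax over the scores.
    max_score = max(scores)
    exps = [math.exp(s - max_score) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Weighted sum of token vectors: the attended text representation.
    pooled = [sum(w * vec[i] for w, vec in zip(weights, token_vectors))
              for i in range(len(query))]
    return weights, pooled

# Toy embeddings (illustrative values, not from the thesis).
query = [1.0, 0.0]
tokens = [[4.0, 0.0], [0.0, 4.0], [-4.0, 0.0]]
weights, pooled = query_attention(query, tokens)
```

In this toy setup the first token, which points in the same direction as the query, receives the largest weight, and the token pointing in the opposite direction receives the smallest.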
Item
Education for the Embodied Human: An Enquiry into Human Nature and Education
(Indian Institute of Technology Jodhpur, 2023-02) Hari Narayanan V
The present work is a philosophical inquiry concerning the interconnection between theories of human nature and education. It is often argued that current mainstream education rests on an underlying dualistic assumption that mind and body, and human and world, are distinct, and therefore does not value embodied forms of knowing. This dualistic notion creates a rift between learners and their environments, substantially impacting their learning. It also results in an exam-oriented and achievement-based education system that is not conducive to developing children's critical thinking and exploratory mindset. One major factor that gives rise to this condition is an inadequate understanding of human nature. Any educational activity presupposes one or another conception of human nature, because a particular philosophy of human nature shapes and influences a particular philosophy of education. Therefore, understanding human nature is of paramount importance for designing or transforming education. An examination of several theories of human nature reveals that they are mostly the result of philosophical speculation and have underlying dualistic assumptions. However, recent empirical studies in cognitive science suggest that human beings are fundamentally embodied and embedded in the world. Being embodied means that the mind is not separate from the body and the world but is dynamically coupled with both. Now, if human beings are fundamentally embodied, then education should also acknowledge this fact in its practices. But even though there is increasing evidence in support of embodiment, we do not sufficiently appreciate it in our day-to-day lives or in educational discourse. This is the result of various psychological, neurological, and social factors. To address this problem, a two-way approach is presented, termed the outer and inner curriculum.
The outer curriculum employs an "outside-in" approach, in which pedagogies are designed and imparted as per embodied principles. However, changing external pedagogies alone will not help much unless we realize our own embodied nature. For this purpose, an inner curriculum is required to make changes "inside out". The inner curriculum fundamentally helps us to realize our own embodied nature, which has significant salutary effects. Therefore, the inner curriculum is seen as complementary to the outer curriculum. The core content of the inner curriculum is mindfulness meditative practices, which help us become aware of our thoughts and sensations and, in turn, embrace our own embodied nature and our inextricable relationship with the world. This kind of embodied approach to education, with its focus on both outer and inner curricula, helps to create a more democratic, collaborative, and holistic learning environment, thereby fulfilling the vision of educational thinkers such as John Dewey, Paulo Freire, and Jiddu Krishnamurti.
Item
From primary microcephaly-associated CPAP as centriole size/number regulator to microtubules-targeting novel chemotype
(Indian Institute of Technology Jodhpur, 2023-07) Singh, Priyanka
Centrioles are cylindrical microtubule-based structures embedded in a proteinaceous matrix called pericentriolar material (PCM). Together, this structure is referred to as the centrosome. In animal cells, the centrosomes play a crucial role in cellular functions such as cell division and motility. Abnormalities in centrosome-associated proteins have been linked to human diseases, including cancer and neurodevelopmental disorders. Mutations in the core centriole protein, Centrosomal P4.1-associated protein (CPAP), have been associated with primary microcephaly (MCPH6), a disorder characterized by reduced brain size and cognitive disability. This study focuses on understanding the impact of CPAP mutations on centrosome and spindle organization in primary microcephaly. Specifically, it investigates the effects of two MCPH-associated mutations, E1235V and D1196N, in the CPAP G-box domain. The study reveals that E1235V causes increased centriole length, while D1196N leads to an increase in centriole number. Interestingly, E1235V does not localize at the centriole, whereas D1196N maintains its centriolar localization despite reduced interaction with the upstream centriole protein, STIL, similar to E1235V. This suggests the involvement of an alternate route involving the proximal parent centriole protein, CEP152. Moreover, we demonstrate that centriole abnormalities result in multipolar spindle formation and decreased cell viability. These findings shed light on the importance of regions within CPAP outside the direct microtubule-interacting domains in influencing centriole organization, providing valuable insights into the molecular mechanisms underlying primary microcephaly. The second part of the thesis work explores the development of novel chemical scaffolds for chemotherapeutics. The cell division machinery, comprising centrosomes and microtubules, is crucially regulated during the cell cycle.
Dysregulation of these structures can lead to human diseases, including cancer. Paclitaxel, a microtubule-targeting anticancer drug, has clinical approval but faces challenges due to the development of resistance in many cancer types. Hence, there is a need to identify new chemical scaffolds for designing effective anticancer drugs. In this study, a novel S-aryl dithiocarbamate chemical scaffold is identified as a potent anticancer compound with promising pharmacophore properties. The lead compound exhibits an IC50 of <0.5 µM in lung and cervical cancer cells. It stabilizes microtubules, resulting in p53-p21-dependent cell cycle arrest in the G2/M stage and cellular apoptosis. Interestingly, the lead compound shows comparable docking parameters to paclitaxel in the taxol-binding pocket of β-tubulin. These findings present a promising alternative scaffold that can be further modified to enhance efficacy and potency as an anticancer drug.
Item
Study of Organic and Quantum Dots-Based Resistive Memory and Synaptic Devices
(Indian Institute of Technology Jodhpur, 2023-09) Sahu, Satyajit
The data storage requirement in the digital world is increasing day by day with the advancement of the Internet of Things (IoT). The current generation of silicon-based memory technology faces serious problems in terms of performance, data storage density, power consumption, data processing time, cost-effectiveness, and so on. Conventional flash memory is a type of non-volatile memory that relies on tunneling through the oxide layer, consumes high power, and has a long response time. The hard disk drive (HDD) is a widely used memory in the current era, but its main disadvantage is the need to find the particular magnetic domain where the data is saved, which raises the memory response time to a few milliseconds. The limitations of conventional memories can be tackled by next-generation memories. Ferroelectric random access memory (FeRAM), phase change memory (PCM), magnetic random access memory (MRAM), and resistive random access memory (RRAM) are considered next-generation memories and have the potential to solve these problems. Among these, non-volatile RRAM is an option that provides high-density and low-power data storage. The information is stored in terms of resistance, where the high resistance state (HRS) represents bit 0 and the low resistance state (LRS) represents bit 1. Computers are based on the von Neumann architecture, where the processor unit is separated from the memory unit and connected to it via a data bus. This separation delays the response, and the architecture cannot go beyond a certain size limit; this is called the von Neumann bottleneck. Resistive memory has the capability to solve the von Neumann bottleneck, since the memory can simultaneously process and store data, similar to the biological brain. The small molecule-based RRAM device is of interest because of its potential use in high-density data storage devices.
A small organic molecule, 5-mercapto-1-methyltetrazole (MMT), has been used with a poly(4-vinylpyridine) (PVP) polymer matrix as the active layer of an RRAM device. MMT in different weight ratios in PVP was studied for RRAM application, revealing an invariant RRAM property. The maximum on/off current ratio for all the devices is 10^5, suggesting that the MMT molecule does not show any change in its characteristic properties when surrounded by an insulating material. When the device was fabricated without the polymer matrix, the surface morphology of the device changed completely, as it was filled with large holes. These holes provide short-circuit pathways for the current by forming direct metal contact between the top and bottom electrodes. Size miniaturization of electronic devices can be achieved using organic small molecules as well as inorganic nanoparticle quantum dots (QDs). CdS QDs were synthesized and their RRAM properties were studied. A metal-insulator-metal (MIM) device with an Al/CdS+PVP/ITO structure was fabricated, and it shows extremely good switching properties. Data retention over 60000 seconds and endurance over 300 cycles were studied. The charge-trapping mechanism is associated with the resistive switching (RS) property. With the development of artificial intelligence and ultra-high-speed computing, a solution to the von Neumann bottleneck is needed. In this regard, resistive memory can provide a solution, as RRAM can act like a biological synapse that stores, processes, and transfers data. This attracts researchers to study resistive memory devices. Here, a resistive memory device based on a composite of the small organic molecule trimesic acid (TMA) and PVP was fabricated. It shows excellent resistive switching with a high on-off ratio, excellent stability, and data storage capability. Pulse transient measurements on the device demonstrated its capability for neuromorphic computation.
The gradual set and reset process and the change of conductance with applied pulses confirmed the neuromorphic application. Paired-pulse facilitation shows that the device can emulate synaptic behaviour of the human brain. The redox-active molecule and its change in conformation are the reasons for the switching behaviour of the device that led to its neuromorphic application. For an important application like RRAM, it is crucial to understand the mechanism at the nanoscale and to control the resistive switching (RS) by various means. Different models have been proposed to explain the RS behaviour of the material. The first is the electrode effect on the switching, which includes the contact type and charge trapping/de-trapping near the electrode-material interface. The other mechanisms are conducting filament formation, electrochemical metallization, ionic diffusion, and oxidation-reduction of the materials. There are thus many disagreements about the proposed models of RS, and resolving them requires study using different experimental techniques. Scanning tunnelling microscopy (STM) is one of the best tools for understanding surface properties as well as studying the local density of states (LDOS) of a material. Using STM to understand RS in materials at the nanoscale would therefore be very helpful. The RS properties and neuromorphic computing capabilities of a single AgInS2 quantum dot have been studied in this chapter with the help of STM and scanning tunnelling spectroscopy (STS). The bandgap of the material and its temperature dependence have been studied, suggesting a nonlinear variation below the Debye temperature and a linear variation above it. The STS shows the change of conducting states after applying localized pulses. The devices made from the quantum dots replicate these properties as well.
The neuromorphic application of the device was tested by using the pulse transient measurement that mimics the learning and forgetting of information through the gradual set and reset process. The localized ionic transport is involved in the RS mechanism.
Item
Development of metal oxide based formaldehyde (HCHO) sensors using laser ablated nanoparticles
(Indian Institute of Technology Jodhpur, 2023-07) Kumar, Mahesh; Singh, Jitendra
In this thesis, we report, for the first time, a state-of-the-art method for synthesizing metal oxide nanoparticles in atmospheric air using laser ablation, for the rapid prototyping of formaldehyde (HCHO) gas sensors. Formaldehyde is one of the most common indoor air pollutants and causes various adverse health problems if its concentration exceeds 0.75 ppm, as per Occupational Safety and Health Administration (OSHA), USA, guidelines. It has also been noticed that some dishonest fish merchants use formalin solution (formaldehyde gas dissolved in water) to preserve freshly caught fish and prevent spoilage during transportation to the fish market. As a result, various health issues have occurred due to the ingestion of formalin-contaminated fish. Thus, we need a miniature, low-cost formaldehyde sensor that is sensitive to ultra-low concentrations, for the development of smart or IoT-enabled portable systems that measure formaldehyde gas levels in indoor areas as well as its presence in fish. A micro hotplate is an essential part of any metal oxide based gas sensor. Hence, in the first part of the study, a co-planar Au microheater based gas sensor platform was fabricated by laser micropatterning using a 355 nm Q-switched solid-state laser source. The heat distribution profile of the fabricated micro hotplate was observed via an IR thermal imaging camera. Furthermore, the long-term reliability and the power versus operating temperature characteristics of the micro hotplate were systematically studied. In addition, the heat distribution profile of a Nichrome heater based gas sensor platform, consisting of a hollow alumina tube with two gold electrodes printed at each end, was also investigated. In the next set of analyses, the formaldehyde gas sensing performance was studied using pristine SnO2 and ZnO metal oxide materials. Formaldehyde gas sensing behaviour was studied by depositing a thin film of SnO2 onto the external surface of the alumina tube based gas sensor platform.
The sputter-deposited SnO2 thin film sensor exhibited a gas response of 1.2 towards 1 ppm of formaldehyde vapor, with a response time of approximately 32 s and a recovery time of approximately 72 s at 300°C. Next, to explore the formaldehyde sensing capabilities of nanoparticles, we synthesized pristine ZnO nanoparticles (NPs) by scanning a high-power laser beam over the top surface of a ZnO pellet in open air, and the laser-ablated ZnO NPs were deposited directly onto the alumina tube based gas sensor platform. The gas-sensing properties of the ZnO NPs were carefully investigated in the presence of formaldehyde gas molecules. The ZnO NP-based sensor exhibited a response of about 1.8 towards 50 ppm formaldehyde gas at 350°C, with a response time of 25 s and a recovery time of 12 s. To further enhance the sensitivity and selectivity towards formaldehyde gas, we fabricated an n-ZnO/n-SnO2 n-n heterojunction by sputtering a SnO2 thin film, a physical vapor deposition (PVD) process, onto the alumina tube based gas sensor platform and decorating it with ZnO nanoparticles. After decoration with laser-ablated ZnO nanoparticles, the thin film SnO2 sensor exhibited a high response of 20 towards 50 ppm of formaldehyde, with quick response (4 s) and recovery (30 s) times at a lower operating temperature (250°C) than that of pure SnO2. Following the good results obtained with this heterojunction, the research was extended further. p-type NiO NPs were synthesized in atmospheric air by laser ablation of a cylindrical solid Ni pellet. We fabricated a p-NiO/n-SnO2 p-n heterojunction by decorating a sputter-deposited n-type SnO2 thin film with laser-ablated NiO nanoparticles. We explored the formaldehyde sensing behaviour of the NiO/SnO2 sensor and compared it with that of the pristine SnO2 sensor.
The NiO/SnO2 sensor exhibited a higher response of about 29.8 towards 50 ppm formaldehyde, with fast response and recovery times (3 s and 90 s) at a lower operating temperature (about 210°C) and with good selectivity. In the last part of the thesis, the enhanced formaldehyde sensing mechanism of the ZnO/SnO2 and NiO/SnO2 sensors is described. From the experimental gas sensing performance data of the NiO/SnO2 sensors, we also extracted various gas sensing parameters, such as response time (τres), recovery time (τrec), surface coverage (θ), and the adsorption (Ka) and desorption (Kd) rate constants, using the Langmuir gas adsorption-desorption model via curve fitting.
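As a rough illustration of the Langmuir adsorption-desorption model named in this abstract, the sketch below evaluates the textbook first-order Langmuir kinetics, in which the rate of change of surface coverage is Ka·C·(1 − coverage) − Kd·coverage for gas concentration C. The abstract does not give the exact expressions fitted in the thesis, so the function names, the assumption of zero initial coverage, and the numeric rate constants are hypothetical and serve only to show how Ka and Kd shape the response and recovery transients.

```python
import math

def equilibrium_coverage(ka, kd, conc):
    """Steady-state surface coverage for gas concentration conc."""
    return ka * conc / (ka * conc + kd)

def coverage_during_exposure(ka, kd, conc, t):
    """Coverage at time t after gas is introduced, starting from zero.

    Closed-form solution of d(theta)/dt = ka*conc*(1 - theta) - kd*theta.
    """
    rate = ka * conc + kd
    return equilibrium_coverage(ka, kd, conc) * (1.0 - math.exp(-rate * t))

def coverage_during_recovery(kd, theta0, t):
    """Coverage at time t after the gas is removed (conc = 0)."""
    return theta0 * math.exp(-kd * t)

# Hypothetical rate constants, NOT values from the thesis.
ka = 0.01   # adsorption rate constant, per ppm per second
kd = 0.5    # desorption rate constant, per second
conc = 50   # formaldehyde concentration, ppm

theta_eq = equilibrium_coverage(ka, kd, conc)
theta_2s = coverage_during_exposure(ka, kd, conc, 2.0)
theta_after = coverage_during_recovery(kd, theta_eq, 2.0)
```

In a curve-fitting workflow, measured response and recovery transients would be compared against these expressions while Ka and Kd are adjusted to minimize the mismatch; the fitting routine itself is omitted here.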