I am trying to implement a custom training loop for object detection using YOLOv8 (Ultralytics) and PyTorch. My goal is to fine-tune a pre-trained yolov8n.pt model on the Aquarium dataset, which ...
I’m having difficulty finding the hardware resource specifications for different LLMs and VLMs. The leaderboard at this link — https://huggingface.co/spaces/opencompass/open_vlm_leaderboard — includes ...
I want to implement in python some algorithms from a paper that allow for a pre-trained neural network to be modified (adding or removing neurons or layers) conserving (theoretically) the outputs of ...
I want to find a convolutional network with a large kernel (larger than 5x5 or 7x7). I want to perform kernel analysis, and to do this, I need to convert the model to the onnx format. I found ...
I have input shape to a convolution (50, 1, 7617, 10). Here, 7617 is word vectors as rows, and 10 is the number of words in columns. I want to convolve column-wise and obtain (2631, 1, 7617, 1), 1 ...
I am trying to install the GroundingDino as instructed in the README file of their official GitHub repo, but I am facing the error below:Obtaining file:///home/kgupta/workspace/Synthetic_Data_gen/...
I am training a LSTM model with data from yfinance. The process is really standard. I get the data with yf.download(ticker=ticker) where ticker='AAPL and do df.rolling(30, min_periods=1) to smooth the ...
I am using the MixStyle methodology for domain adaptation, and it involves using a custom layer that is inserted after every encoder stage. However, it is causing VRAM to grow linearly, which causes ...
So, I’m trying to understand why sometimes neural networks get stuck during training. I heard people talk about ‘local minima’ and ‘saddle points,’ but I can’t really picture them. I want to actually ...
I am trying to perform KFold cross-validation on a Keras model. The first fold runs exactly as expected, but from the second fold onwards the model doesn’t seem to reset. The training behaves ...
I am training a model using TensorFlow/Keras using TensorFlow 2.19.0/Keras 3.10.0. During training, I monitor nvidia-smi and top, and the system RAM and the GPU RAM increase during the training period....
I'm doing some experiments with Flax NNX (not Linen!).What I'm trying to do is compute the weights of a network using another network:A hypernetwork receives some input parameters W and outputs a ...
I'm building a neural network from scratch using only Python and numpy, It's meant for classifying the MNIST data set, I got everything to work but the network isn't really learning, at epoch 0 it's ...
I am trying to implement classification of ECG segments from PTB-XL database (https://physionet.org/content/ptb-xl/1.0.3/). The architecture of the model which I am using is:import torchimport torch....
No matter which input I give it after training, it still spits the class distribution.. whereas if I just remove the hidden layer and use a single layer nn, it works much better.I know the proper ...