I am trying to use TensorFlow to recreate the results achieved in the PVNet project. Paper, GitHub.

I've run into a problem that is preventing me from completing the project. I'm training a model that produces two output tensors: one gives a set of vectors pointing to keypoints, and the other should classify the object. Unfortunately, the classifier output does not seem to learn to detect the object.

E.g. output prediction vs. target:

The prediction was generated using the 'fullStvNet' model trained on the 'duck' class for 10 epochs. This model's weights are saved in my repo as '10epoch_duck'.

The training code is in basicModel.py -> trainModel:

def trainModel(modelFn, modelClass = 'duck', batchSize = 2, optimizer = 'adam',
               losses = {'activation_9': smoothL1, 'activation_10': tf.keras.losses.CategoricalCrossentropy()},
               metrics = ['accuracy'], saveModel = True, modelName = 'stvNet_weights', epochs = 1):
    model = modelFn()
    model.summary()
    model.compile(optimizer = optimizer, loss = losses, metrics = metrics)
    sampleSize = len(os.listdir(os.path.dirname(os.path.realpath(__file__)) + '\\LINEMOD\\' + modelClass + '\\JPEGImages\\'))
    model.fit(data.trainingGenerator2(modelClass, batchSize),
              steps_per_epoch = math.ceil(sampleSize / batchSize),
              max_queue_size = 2, epochs = epochs)
    if saveModel:
        model.save_weights(os.path.dirname(os.path.realpath(__file__)) + '\\models\\' + modelName + '_' + modelClass)
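For reference, `smoothL1` above is my custom loss for the vector head. A minimal NumPy sketch of what I intend it to compute (the standard smooth-L1 / Huber-style formula with a threshold of 1; the actual Keras implementation operates on tensors, this is just the elementwise math):

```python
import numpy as np

def smooth_l1_np(y_true, y_pred):
    # elementwise smooth L1: 0.5*d^2 when |d| < 1, |d| - 0.5 otherwise
    d = np.abs(y_true - y_pred)
    return np.where(d < 1.0, 0.5 * d ** 2, d - 0.5)

print(smooth_l1_np(np.array([0.0, 0.0]), np.array([0.5, 2.0])))  # [0.125 1.5]
```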

The code responsible for loading and formatting the data is in data.py -> trainingGenerator2:

def trainingGenerator2(model, batchSize, height = 480, width = 640, allPix = False, numClasses = 1):
    # take input image, resize and store as rgb, create training data
    basePath = os.path.dirname(os.path.realpath(__file__)) + '\\LINEMOD\\' + model
    masterList = getMasterList(basePath)
    i = 0
    while True:
        xBatch = []
        yCoordBatch = []
        yClassBatch = []
        for b in range(batchSize):
            if i == len(masterList):
                i = 0
                random.shuffle(masterList)
            x = filePathToArray(basePath + '\\JPEGImages\\' + masterList[i][0], height, width)
            with open(basePath + '\\labels\\' + masterList[i][2]) as f:
                labels = f.readline().split(' ')[1:19]
            yCoordsLabels = np.zeros((height, width, 18))  # 9 coordinates
            #yClassLabels = np.zeros((height, width, 1))  # 1 class confidence value per model
            yClassLabels = np.tile(np.array([1, 0]), (height, width, 1))
            if not allPix:
                modelMask = filePathToArray(basePath + '\\mask\\' + masterList[i][1], height, width)
                #showArrayAsImage(modelMask, 1, 'RGB')
                modelCoords = np.where(modelMask == 255)[:2]
                yCoords = modelCoords[0][::3]
                xCoords = modelCoords[1][::3]
                for modelCoord in zip(yCoords, xCoords):
                    setTrainingPixel(yCoordsLabels, modelCoord[0], modelCoord[1], labels)
                    #yClassLabels[modelCoord[0]][modelCoord[1]][0] = 1
                    yClassLabels[modelCoord[0]][modelCoord[1]] = np.array([0, 1])
            xBatch.append(x)
            yCoordBatch.append(yCoordsLabels)
            yClassBatch.append(yClassLabels)
            i += 1
            #print(i)
        yield (np.array(xBatch), {'activation_9': np.array(yCoordBatch), 'activation_10': np.array(yClassBatch)})
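To show what the class-label tensor ends up looking like, here is the same construction reproduced in isolation on a tiny grid (the dimensions are a stand-in for 480x640; at full resolution the object pixels are a small fraction of the total, so the two classes are heavily imbalanced):

```python
import numpy as np

h, w = 4, 6  # tiny stand-in for height = 480, width = 640
y_class = np.tile(np.array([1, 0]), (h, w, 1))  # every pixel starts as background [1, 0]
assert y_class.shape == (h, w, 2)

# flip a couple of "object" pixels to [0, 1], as the generator does for masked pixels
for (r, c) in [(1, 2), (2, 3)]:
    y_class[r][c] = np.array([0, 1])

n_object = int(y_class[..., 1].sum())
n_background = h * w - n_object
print(n_object, n_background)  # 2 object pixels vs. 22 background pixels
```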

The rest of my code can be found in my GitHub repo. Any help or advice is appreciated, as I'm completely stuck.