\chapter{Experimental results}
\label{sec:results}

We evaluated the proposed method on two datasets: KITTI~\cite{Geiger2013CVPR}, which is widely used in self-driving research, and Lyft Level~5~\cite{Lyft2020}, which contains more complex scenes than KITTI.

\section{KITTI}
\label{sec:results-kitti}

The KITTI dataset contains raw data captured by a moving car equipped with various sensors, including LiDARs and cameras mounted at different locations. We used the scans of the front 64-beam LiDAR as input point clouds and the RGB images captured by the left camera as reference images. We trained our model on the KITTI training split, which consists of $43$ sequences (videos), and evaluated it on the validation split, which consists of $23$ sequences.

\subsection{Quantitative evaluation}
\label{subsec:quantitative-evaluation-kitti}

We first evaluated our method quantitatively against the baselines described in Section~\ref{subsec:related-work}. To obtain consistent results across approaches, we carefully tuned hyperparameters such as the voxel size ($v_s$), the number of input frames ($N$), and the batch size ($b_s$), keeping them as similar as possible across methods. Table~\ref{tab:kitti-results} reports the quantitative results on the KITTI validation data.

\begin{table}[t]
  \caption{\label{tab:kitti-results} Quantitative comparison on the KITTI validation data. We report the average distance error in $mm$ (mean $\pm$ standard deviation); lower is better, and the best results are shown in bold.}
  \vspace{-0.5em}
  \centering
  \begin{tabular}{lcccc}
    \toprule
    Method & \#Frames ($N$) & Train & Test & Validation \\
    \midrule
    VoxelFlow~\cite{Dosovitskiy2015CVPR} (w/o feature matching) & -- & -- & -- & $1644 \pm \phantom{0}98$ \\
    VoxelFlow~\cite{Dosovitskiy2015CVPR} (w/ feature matching)  & -- & -- & -- & $1436 \pm \phantom{0}99$ \\
    FlowNet3D~\cite{Tatarchenko2019ECCV} & $10$  & $\phantom{0}882 \pm 81$ & $\phantom{0}951 \pm 68$ & $\phantom{0}917 \pm 64$ \\
    FlowNet3D~\cite{Tatarchenko2019ECCV} & $25$  & $\phantom{0}886 \pm 84$ & $\phantom{0}950 \pm 68$ & $\phantom{0}911 \pm 64$ \\
    FlowNet3D~\cite{Tatarchenko2019ECCV} & $100$ & $\mathbf{\phantom{0}879 \pm 78}$ & $\mathbf{\phantom{0}948 \pm 67}$ & $\mathbf{\phantom{0}907 \pm 63}$ \\
    Ours & $10$  & $\phantom{0}955 \pm 84$ & -- & $1028 \pm 71$ \\
    Ours & $25$  & $\phantom{0}959 \pm 83$ & -- & $1021 \pm 69$ \\
    Ours & $100$ & $\phantom{0}952 \pm 81$ & -- & $1014 \pm 68$ \\
    \bottomrule
  \end{tabular}
  \vspace{-1em}
\end{table}

As shown in Table~\ref{tab:kitti-results}, our method achieved performance comparable to FlowNet3D~\cite{Tatarchenko2019ECCV}, especially when more frames were used as input. The reported metric, the average distance error, measures the Euclidean distance between corresponding points after applying the estimated flow and after applying the ground-truth motion.
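Concretely, the metric can be written as follows; this is a minimal formalization in notation introduced here only for illustration (an evaluation set $\mathcal{P}$ of points, estimated flow $\hat{\mathbf{f}}$, and ground-truth flow $\mathbf{f}^{*}$):
\begin{equation}
  \mathrm{ADE} = \frac{1}{|\mathcal{P}|} \sum_{\mathbf{p} \in \mathcal{P}} \bigl\lVert \hat{\mathbf{f}}(\mathbf{p}) - \mathbf{f}^{*}(\mathbf{p}) \bigr\rVert_{2} .
\end{equation}
An equivalent NumPy sketch is given below; \texttt{pred\_flow} and \texttt{gt\_flow} are hypothetical arrays of per-point flow vectors, not identifiers from our code base:
\begin{verbatim}
import numpy as np

def average_distance_error(pred_flow, gt_flow):
    # Both arrays have shape (num_points, 3) and are expressed in mm.
    # The metric is the mean Euclidean distance between predicted and
    # ground-truth per-point motion vectors.
    return float(np.mean(np.linalg.norm(pred_flow - gt_flow, axis=1)))
\end{verbatim}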
However, there is still room for improvement, as our method outperformed FlowNet3D only when more than $10$ frames were used as input. Our method nevertheless produced meaningful results even with only $10$ input frames, whereas FlowNet3D failed in such cases. On top of that, our method ran at around $60$~fps, significantly faster than FlowNet3D, whose encoder--decoder architecture is computationally expensive.

We also compared our method against VoxelFlow~\cite{Dosovitskiy2015CVPR}, which uses optical-flow estimates from images as supervision signals instead of supervising directly with LiDAR data as we do. As shown in Table~\ref{tab:kitti-results}, VoxelFlow performed much worse than our method, even with feature matching, which itself improves substantially over the variant without feature matching. This indicates that supervision signals derived from LiDAR data are considerably more reliable than optical-flow estimates from images, which are prone to errors caused by occlusions, textureless regions, and similar effects.

In summary, our approach achieves performance comparable to FlowNet3D while being significantly faster thanks to its simpler encoder--decoder architecture. Moreover, unlike FlowNet3D, our method does not require any pre-training or fine-tuning before training from scratch, which makes it more suitable for real-time applications where computational resources are limited.

We additionally performed an ablation study, varying the number of input frames ($N$), the batch size ($b_s$), the learning rate ($l_r$), the voxel size ($v_s$), and the number of training iterations ($i_t$) and epochs ($e_t$); due to space constraints, the full results are reported in Appendix~\ref{ssec:kitti-ablation-study}. The default settings shared across these runs are summarized in the sketch below.
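For reference, a minimal sketch of these default settings follows; the key names and the Python-dictionary layout are illustrative only and are not taken from our training code, while the values correspond to the fixed settings used in the experiments above.
\begin{verbatim}
# Default KITTI training settings, kept fixed while a single
# hyperparameter is varied in the ablation study.
config = {
    "num_input_frames": 10,    # N; also evaluated with 25 and 100
    "voxel_size": 16,          # v_s
    "batch_size": 8,           # b_s, used for train/val/test
    "learning_rate": 1e-4,     # l_r
    "num_iterations": 20000,   # i_t
    "num_epochs": 20,          # e_t
}
\end{verbatim}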