Autonomous Navigation of an Unmanned Aerial Vehicle (UAV) in a Simulated Canal Environment Using Deep Reinforcement Learning

Post Date

Feb 27 2026

Tallat_Mahmood_MS_Thesis_(18060028).pdf (14.71 MB)

Year

2020

Supervisor:

Dr. Abubakr Muhammad

Students:

Tallat Mahmood

MS/PhD

Reference / Filters

Electrical Engineering

Agriculture is a major contributor to Pakistan's GDP and economy. Its productivity depends heavily on a canal-based irrigation system, which in turn requires regular inspection and maintenance. Canals are usually inspected manually, a time-consuming and laborious task that is impractical at scale and holds back the growth of the sector. A possible solution is autonomous inspection and monitoring, performed by navigating Unmanned Aerial Vehicles (UAVs) equipped with sensors such as LiDAR and stereo cameras along the canals. Existing autonomous navigation systems require a 3D map of the environment to find an obstacle-free path for the UAV. Building the 3D map and running the path-planning algorithms demand substantial on-board computation and memory, which UAVs lack because of their small size, complex dynamic constraints, and limited power. To address these issues, we propose a deep reinforcement learning-based approach to autonomous canal navigation. In the proposed approach, the action policy is learned from observations (depth images) and rewards (feedback) from the environment, so that the UAV navigates robustly without a 3D map of the canal. We simulate a realistic canal-like environment in Unreal Engine. We learn action policies with the Deep Q-Network (DQN) and Double Deep Q-Network (Double DQN) algorithms to reach a waypoint (GPS coordinate) on the canal. Testing the learned policies shows that Double DQN outperforms DQN on both seen and unseen parts of the canal, so Double DQN is used to train the UAV over the complete canal. In this setup, the canal is divided by equidistant waypoints, and navigation is achieved by reaching these waypoints sequentially, with the immediate next waypoint serving as the goal. The analysis concludes that patch-wise training is required to navigate the complete canal. The experimental results show that the UAV can navigate the complete canal while keeping a safe distance from obstacles, without a 3D map of the canal.
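
The difference between the two algorithms named in the abstract lies in how the learning target is formed. The following is a minimal NumPy sketch of the standard textbook targets, not the thesis code; the function names, the discount factor `gamma`, and the array layout (one row per transition, one column per discrete action) are assumptions made for illustration.

```python
import numpy as np

def dqn_target(reward, done, q_next_target, gamma=0.99):
    """DQN target: the target network both selects and evaluates the next action."""
    return reward + gamma * (1.0 - done) * q_next_target.max(axis=1)

def double_dqn_target(reward, done, q_next_online, q_next_target, gamma=0.99):
    """Double DQN target: the online network selects the next action,
    the target network evaluates it."""
    best_action = q_next_online.argmax(axis=1)
    rows = np.arange(len(best_action))
    evaluated = q_next_target[rows, best_action]
    return reward + gamma * (1.0 - done) * evaluated

# Hypothetical batch of two transitions with three discrete actions.
reward = np.array([1.0, 0.0])
done = np.array([0.0, 1.0])  # 1.0 marks a terminal transition (e.g. a collision)
q_next_online = np.array([[1.0, 2.0, 0.0], [0.0, 0.0, 5.0]])
q_next_target = np.array([[3.0, 0.5, 1.0], [2.0, 2.0, 2.0]])

y_dqn = dqn_target(reward, done, q_next_target, gamma=0.9)
y_ddqn = double_dqn_target(reward, done, q_next_online, q_next_target, gamma=0.9)
```

Separating action selection from action evaluation is the usual explanation for why Double DQN tends to overestimate action values less than DQN.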

The agriculture sector has a very large impact on Pakistan's GDP and plays an important role in the country's economy. The productivity of the agriculture sector depends on an efficient irrigation system consisting of canals (flows from dams, rivers, and barrages). Making the irrigation system efficient requires proper inspection and maintenance. Usually, canal inspection is performed manually, which is a time-consuming and laborious process that affects the growth of the agriculture sector. To deal with these problems, a possible solution is automated inspection and monitoring of the canal. An autonomous inspection is carried out by navigating unmanned aerial vehicles (UAVs) over the canal with the help of sensors such as LiDAR and stereo cameras. Existing autonomous navigation systems require a three-dimensional (3D) map of the environment to find an obstacle-free path for the UAV. Building the 3D map and running planning algorithms to find an obstacle-free path require extensive computational resources and memory on the UAV. UAVs lack these resources because of their small size, complex motion constraints, and limited power. To deal with these problems, we propose a deep reinforcement learning (DRL) based approach for navigating the UAV autonomously over the canal. Under the proposed approach, we learn an action policy using the current observation of the environment (depth image, UAV state) to navigate the UAV without a 3D map of the canal. The action policy improves with the help of the reward or penalty received as a result of each action. We have built a realistic canal-like environment in the Unreal Engine (UE) simulator with Microsoft's AirSim UE plugin. We learned action policies using the Deep Q-Network (DQN) and Double Deep Q-Network (Double DQN) algorithms to reach a point (GPS coordinate) on the canal. Test analysis of the learned policies shows that Double DQN performs better on the seen (explored) parts of the canal. Therefore, Double DQN was used to train the UAV to navigate the complete canal. In this setup, the path along the whole canal is divided into equal parts with the help of GPS coordinates. The UAV completes navigation of the whole canal by reaching the successive GPS coordinates one after another. A comprehensive analysis of this setup concludes that navigating the whole canal with the proposed approach requires part-wise training on the canal so that the UAV can explore the whole canal. The results of the comprehensive analysis of the proposed approach show that the UAV can intelligently navigate at a safe distance from obstacles without a 3D map of the canal.
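
The sequential-waypoint scheme described in the abstract can be illustrated with a short Python sketch. It is not the thesis implementation: it assumes a straight canal stretch expressed in local planar coordinates (metres) rather than GPS coordinates, and the names `equidistant_waypoints`, `WaypointGoal`, and the 2 m `reach_radius` are hypothetical choices made only for this example.

```python
import numpy as np

def equidistant_waypoints(start, end, n_segments):
    """Split the straight line from start to end into n_segments equal parts
    and return the waypoints that close each part (the start is excluded)."""
    start = np.asarray(start, dtype=float)
    end = np.asarray(end, dtype=float)
    return np.linspace(start, end, n_segments + 1)[1:]

class WaypointGoal:
    """Tracks which waypoint is the current goal; advances when it is reached."""

    def __init__(self, waypoints, reach_radius=2.0):
        self.waypoints = np.asarray(waypoints, dtype=float)
        self.reach_radius = reach_radius  # assumed tolerance, in metres
        self.index = 0

    @property
    def done(self):
        return self.index >= len(self.waypoints)

    def current_goal(self):
        return None if self.done else self.waypoints[self.index]

    def update(self, position):
        """Advance to the next waypoint if the current one has been reached."""
        if not self.done:
            offset = np.asarray(position, dtype=float) - self.waypoints[self.index]
            if np.linalg.norm(offset) <= self.reach_radius:
                self.index += 1
        return self.current_goal()

# Hypothetical 100 m stretch split into four 25 m parts.
waypoints = equidistant_waypoints((0.0, 0.0), (100.0, 0.0), n_segments=4)
tracker = WaypointGoal(waypoints, reach_radius=2.0)
goal_far = tracker.update((10.0, 0.0))   # first waypoint not yet reached
goal_near = tracker.update((24.0, 0.0))  # within 2 m, so the goal advances
```

In a training loop, the goal returned by `update` would be the point the reward is measured against, so each episode only has to reach the immediate next waypoint.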